	---
	license: mit
	language:
	- bn
	tags:
	- image-classification
	- handwritten-character-recognition
	- bangla
	- computer-vision
	- efficientnet
	model-index:
	- name: Bangla Handwritten Character Recognition
	  results: []
	---

	# Model Card for Bangla Handwritten Character Recognition

	This is a Bangla handwritten character recognition model that classifies images into 195 character classes, including compound characters. It is designed for offline recognition of handwritten Bangla script.

	## Model Details
	Architecture: EfficientNetV2S (feature extractor)

	Custom Layers: Triple-head architecture with two attention mechanisms, followed by a soft average ensemble layer.

	Parameters: 30,912,695 (total)

	Framework: TensorFlow / Keras

	Finetuned from: EfficientNetV2S pretrained on ImageNet

	Developed by: Team Segfault

	License: MIT


	### Model Description

	This model is a deep learning-based handwritten character classifier designed specifically for the Bangla language. It recognizes 195 different Bangla characters, including basic vowels and consonants as well as compound letters.

	Built on the EfficientNetV2S architecture as a feature extractor, the model adds a custom triple-head fully connected (FC) architecture in which two of the three heads use an attention mechanism to sharpen class discrimination. The class probabilities from all three heads are averaged (soft voting) to produce the final prediction, which is intended to improve generalization and reduce overfitting.
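The soft-voting step described above can be illustrated with a small, framework-free sketch. It assumes each head emits raw logits that are turned into probabilities with a softmax before averaging; the card does not say whether averaging happens on logits or probabilities, and the three-class toy logits are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(head_logits):
    """Average per-head probability vectors; return (predicted class, averaged probs)."""
    probs = [softmax(logits) for logits in head_logits]
    n_heads = len(probs)
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / n_heads for c in range(n_classes)]
    pred = max(range(n_classes), key=avg.__getitem__)
    return pred, avg


# Toy example with 3 classes instead of 195 (values are illustrative only).
head_logits = [
    [2.0, 1.0, 0.1],  # head 1 (with attention)
    [0.5, 1.5, 0.3],  # head 2 (with attention), prefers class 1
    [1.8, 1.2, 0.2],  # head 3 (baseline)
]
pred, avg = soft_vote(head_logits)
print(pred, [round(p, 3) for p in avg])
```

Here head 2 alone would pick class 1, but the averaged distribution favors class 0 because the other two heads agree on it.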

	The goal of this model is to support accurate offline recognition of handwritten Bangla text, which is useful for educational tools, document digitization, and OCR applications focused on Bangla script. It reaches about 96.6% accuracy on the held-out test split of a class-balanced dataset, using a data augmentation pipeline that includes ElasticTransform and random rotation.

	This model was trained on the MatrivashaBangla dataset, a newly combined dataset containing roughly 500,000 samples. The entire training and evaluation pipeline is implemented in TensorFlow/Keras.


	- Developed by: Team Segfault
	- Model type: Deep Neural Network
	- License: MIT
	- Finetuned from model: EfficientNetV2S


	## Uses
	1. Offline OCR systems for Bangla script
	2. Educational software or Bangla digitization tools
	3. Document recognition systems

	### Out-of-Scope Use
	1. Not intended for real-time inference on low-power devices
	2. Not designed for other scripts or languages

	## Training Details

	### Training Data
	Dataset Name: MatrivashaBangla (a newly combined dataset built from BanglaLekha-Isolated and Matrivasha_raw)

	Size: 195 classes, ~500,000 images

	Augmentations: ElasticTransform, Rotate

	Balance: Class-balanced via custom augmentation scripts (2,500–2,580 images per class)
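As a rough illustration of the balancing step, the sketch below computes how many augmented images each class would need to reach the lower end of the stated per-class range, and checks that the range is consistent with the ~500,000 total. The raw per-class counts and the `augmentation_quota` helper are hypothetical; the card does not publish the actual balancing scripts.

```python
NUM_CLASSES = 195                     # from the model card
TARGET_MIN, TARGET_MAX = 2500, 2580   # per-class range stated in the card


def augmentation_quota(raw_counts, target=TARGET_MIN):
    """Return how many augmented images each class needs to reach `target`."""
    return {label: max(0, target - count) for label, count in raw_counts.items()}


# Hypothetical raw counts for three classes (not taken from the real dataset).
raw_counts = {"ক": 1980, "খ": 2500, "ক্ষ": 640}
quota = augmentation_quota(raw_counts)

# Dataset size implied by the per-class range.
size_bounds = (NUM_CLASSES * TARGET_MIN, NUM_CLASSES * TARGET_MAX)

print(quota)
print(size_bounds)
```

The implied total of 487,500 to 503,100 images matches the ~500,000 figure above.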


	## Evaluation

	Train Accuracy: ~97.59%

	Validation Accuracy: ~96.69%

	Metrics: Accuracy, F1-score, Precision, Recall

	Evaluation Dataset: Held-out test split (10% of MatrivashaBangla)

	Test Accuracy: ~96.60%
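For reference, the listed metrics can be computed from true and predicted labels as in the minimal sketch below. It assumes macro averaging over classes, which is a natural choice for a class-balanced dataset but is not confirmed by the card; the `macro_metrics` helper and the toy labels are illustrative only.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(num, den):
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]
    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Toy labels with 3 classes (illustrative, not model output).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
report = macro_metrics(y_true, y_pred)
print(report)
```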

	## Model Architecture
	EfficientNetV2S (Frozen Layers)
	↓
	Feature Output (Shared)
	↓
	GlobalAveragePooling
	↓
	├── FC Head 1 with Attention
	├── FC Head 2 with Attention
	└── FC Head 3 (Baseline)
	↓
	Soft Average of 3 Head Outputs
	↓
	Final Prediction (195 classes)
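The diagram above could be expressed in Keras roughly as follows. This is a sketch under several assumptions, not the released training code: the input size, the 512-unit hidden layers, and the sigmoid-gate form of "attention" are all hypothetical, since the card does not specify them, so the parameter count will not match the reported total.

```python
NUM_CLASSES = 195             # from the model card
INPUT_SHAPE = (224, 224, 3)   # assumption: input size is not stated in the card


def build_model(num_classes=NUM_CLASSES, input_shape=INPUT_SHAPE, weights="imagenet"):
    """Frozen EfficientNetV2S backbone, three FC heads, averaged softmax outputs."""
    # Imported lazily so the constants can be inspected without TensorFlow installed.
    import tensorflow as tf
    from tensorflow.keras import layers

    backbone = tf.keras.applications.EfficientNetV2S(
        include_top=False, weights=weights, input_shape=input_shape
    )
    backbone.trainable = False  # "Frozen Layers" in the diagram

    inputs = layers.Input(shape=input_shape)
    features = backbone(inputs, training=False)
    pooled = layers.GlobalAveragePooling2D()(features)  # shared feature vector

    def attention_head(x, name):
        # Hypothetical attention: a learned sigmoid gate that reweights features.
        gate = layers.Dense(x.shape[-1], activation="sigmoid", name=f"{name}_gate")(x)
        x = layers.Multiply(name=f"{name}_gated")([x, gate])
        x = layers.Dense(512, activation="relu", name=f"{name}_fc")(x)
        return layers.Dense(num_classes, activation="softmax", name=f"{name}_out")(x)

    def baseline_head(x, name):
        x = layers.Dense(512, activation="relu", name=f"{name}_fc")(x)
        return layers.Dense(num_classes, activation="softmax", name=f"{name}_out")(x)

    heads = [
        attention_head(pooled, "head1"),
        attention_head(pooled, "head2"),
        baseline_head(pooled, "head3"),
    ]
    outputs = layers.Average(name="soft_average")(heads)  # soft voting
    return tf.keras.Model(inputs, outputs, name="triple_head_sketch")
```

Calling `build_model(weights=None)` builds the same graph without downloading the ImageNet weights, which is convenient for inspecting the layer layout with `model.summary()`.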


	## Limitations
	1. May misclassify very poor handwriting or characters written on noisy backgrounds
	2. Not tested for real-time edge deployment
	3. Trained only on standard handwritten script; cursive, artistic, and stylized forms are not covered

	## Environmental Impact
	- Compute Region: Local training (not cloud)
	- Carbon Emitted: ~5–10 kg CO2

	## Model Card Contact
	Author: Meharaz Hossain

	Email: meharaz733@gmail.com

	GitHub: https://github.com/meharaz733

	Hugging Face: https://huggingface.co/meharaz733