---
language:
- en
license: mit
tags:
- handwriting-recognition
- ocr
- computer-vision
- pytorch
- crnn
- ctc
- iam-dataset
library_name: pytorch
datasets:
- Teklia/IAM-line
metrics:
- cer
- wer
model-index:
- name: handwriting-recognition-iam
  results:
  - task:
      type: image-to-text
      name: Handwriting Recognition
    dataset:
      name: IAM Handwriting Database
      type: Teklia/IAM-line
    metrics:
    - type: cer
      value: 0.1295
      name: Character Error Rate
    - type: wer
      value: 0.4247
      name: Word Error Rate
---

# Handwriting Recognition

A complete handwriting recognition system using CNN-BiLSTM-CTC on the IAM dataset.

## Files

### 1. **analysis.ipynb** - Dataset Analysis
- Exploratory Data Analysis (EDA)
- 5 detailed charts saved to the `charts/` folder
- Runs locally or on Colab (no GPU needed)

### 2. **train_colab.ipynb** - Model Training (GPU)
- **Google Colab GPU compatible**
- Full training pipeline
- CNN-BiLSTM-CTC model (~9.1M parameters)
- Automatic model saving
- Download the trained model for deployment

## Quick Start

### Option 1: Analyze the Dataset (Local/Colab)
```bash
jupyter notebook analysis.ipynb
```
- No GPU needed
- Generates 5 EDA charts
- Fast (~2 minutes)

### Option 2: Train the Model (Google Colab GPU)

1. **Upload `train_colab.ipynb` to Google Colab**
2. **Change the runtime to GPU:**
   - Runtime → Change runtime type → GPU (T4 recommended)
3. **Run all cells**
4. **Download the trained model** (last cell)

|
| | **Training Time:** ~1-2 hours for 20 epochs on T4 GPU |
| |
|
| | ## π Charts Generated |
| |
|
| | From `analysis.ipynb`: |
| | 1. `charts/01_sample_images.png` - 10 sample handwritten texts |
| | 2. `charts/02_text_length_distribution.png` - Text statistics |
| | 3. `charts/03_image_dimensions.png` - Image analysis |
| | 4. `charts/04_character_frequency.png` - Character distribution |
| | 5. `charts/05_summary_statistics.png` - Summary table |
| |
|
## Model Details

**Architecture:**
- **CNN**: 7 convolutional blocks (feature extraction)
- **BiLSTM**: 2 layers, 256 hidden units (sequence modeling)
- **CTC loss**: Alignment-free training (no per-character segmentation needed)

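The three stages can be sketched in PyTorch. This is only a simplified illustration (two conv blocks rather than seven, so far smaller than the real ~9.1M-parameter model defined in `train_colab.ipynb`); the layer sizes here are assumptions, not the actual configuration:

```python
import torch
import torch.nn as nn

class TinyCRNN(nn.Module):
    """Simplified sketch of the CNN -> BiLSTM -> linear pipeline."""
    def __init__(self, num_chars, hidden=256):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # After two 2x2 pools, a height-32 input has height 8 -> 64*8 features per column
        self.rnn = nn.LSTM(64 * 8, hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        # +1 output class for the CTC blank symbol
        self.fc = nn.Linear(hidden * 2, num_chars + 1)

    def forward(self, x):              # x: (B, 1, 32, W)
        f = self.cnn(x)                # (B, 64, 8, W/4)
        f = f.permute(0, 3, 1, 2)      # (B, W/4, 64, 8) - width becomes the time axis
        f = f.flatten(2)               # (B, W/4, 512)
        out, _ = self.rnn(f)           # (B, W/4, 2*hidden)
        # For nn.CTCLoss you would log_softmax these logits and permute to (T, B, C)
        return self.fc(out)            # (B, W/4, num_chars + 1)

model = TinyCRNN(num_chars=79)
logits = model(torch.zeros(2, 1, 32, 128))  # 2 images, height 32, width 128
print(logits.shape)                         # torch.Size([2, 32, 80])
```

The key design point is that the CNN's width dimension becomes the BiLSTM's time axis, so one logit vector is emitted per horizontal slice of the image.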
**Dataset:** Teklia/IAM-line (Hugging Face)
- Train: 6,482 samples
- Validation: 976 samples
- Test: 2,915 samples
**Metrics:**
- **CER** (Character Error Rate)
- **WER** (Word Error Rate)

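Both metrics are edit distances normalized by reference length; the `jiwer` package from the requirements computes them as `jiwer.cer` and `jiwer.wer`. As a self-contained illustration, a minimal pure-Python version:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution
    return dp[-1]

def cer(ref, hyp):
    """Character Error Rate: character edits / reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)

def wer(ref, hyp):
    """Word Error Rate: word edits / reference word count."""
    return edit_distance(ref.split(), hyp.split()) / max(len(ref.split()), 1)

print(cer("hello world", "hella world"))  # 1 substitution over 11 characters
print(wer("hello world", "hella world"))  # 1 wrong word out of 2
```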
## Model Files

After training in Colab:
- `best_model.pth` - Trained model weights
- `training_history.png` - Loss/CER/WER plots
- `predictions.png` - Sample predictions
## Requirements

```
torch>=2.0.0
datasets>=2.14.0
pillow>=9.5.0
numpy>=1.24.0
matplotlib>=3.7.0
seaborn>=0.13.0
jupyter>=1.0.0
jiwer>=3.0.0
```
## Usage

### Load the Trained Model
```python
import torch

# Load the checkpoint; weights_only=False because it also stores the
# character mapper object, and map_location='cpu' works without a GPU
checkpoint = torch.load('best_model.pth', map_location='cpu', weights_only=False)
char_mapper = checkpoint['char_mapper']

# Recreate the model (copy the CRNN class from train_colab.ipynb)
from train_colab import CRNN
model = CRNN(num_chars=len(char_mapper.chars))
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Predict
# ... (preprocessing + inference)
```

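The elided inference step ends with CTC decoding of the model's per-time-step outputs. A minimal greedy decoder (the usual baseline: collapse repeated indices, then drop blanks); the `idx_to_char` mapping below is a hypothetical stand-in for the checkpoint's character mapper:

```python
def ctc_greedy_decode(indices, idx_to_char, blank=0):
    """Collapse repeats, then drop blanks (standard CTC greedy decoding)."""
    out = []
    prev = None
    for i in indices:
        if i != prev and i != blank:
            out.append(idx_to_char[i])
        prev = i
    return "".join(out)

# Per-time-step argmax indices from the model output (0 = CTC blank)
idx_to_char = {1: "c", 2: "a", 3: "t"}
print(ctc_greedy_decode([1, 1, 0, 2, 0, 3, 3], idx_to_char))  # cat
```

Note the order matters: repeats are collapsed before blanks are removed, which is what lets CTC represent genuine double letters via an intervening blank.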
## Notes

- **GPU strongly recommended** for training (use a Colab T4)
- Training on CPU is extremely slow (roughly 20x slower than a T4)
- Colab's free tier has a 12-hour session limit, which is enough for 20 epochs
- The model checkpoint includes the character mapper, so no extra files are needed for deployment

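On the last point, bundling the mapper with the weights keeps inference self-contained. A sketch of the idea; the dict-based `char_mapper` payload and the `epoch` key here are hypothetical stand-ins for whatever the notebook's mapper class actually stores:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for the trained CRNN
checkpoint = {
    "model_state_dict": model.state_dict(),
    "char_mapper": {"chars": "abcdefghijklmnopqrstuvwxyz"},  # hypothetical payload
    "epoch": 20,
}
torch.save(checkpoint, "best_model.pth")

# Later (or on another machine): restore both the weights and the mapping
restored = torch.load("best_model.pth", map_location="cpu", weights_only=False)
print(sorted(restored.keys()))  # ['char_mapper', 'epoch', 'model_state_dict']
```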
## Training Tips

1. **Start with fewer epochs** (5-10) to verify the pipeline
2. **Monitor CER/WER** - stop early if they plateau
3. **Increase epochs** (up to 50) if the metrics are still improving
4. **Save checkpoints** before Colab disconnects
5. **Download the model immediately** after training finishes
## License

Code: MIT. Dataset: IAM Handwriting Database (research use only).