---
language:
- en
license: mit
tags:
- handwriting-recognition
- ocr
- computer-vision
- pytorch
- crnn
- ctc
- iam-dataset
library_name: pytorch
datasets:
- Teklia/IAM-line
metrics:
- cer
- wer
model-index:
- name: handwriting-recognition-iam
  results:
  - task:
      type: image-to-text
      name: Handwriting Recognition
    dataset:
      name: IAM Handwriting Database
      type: Teklia/IAM-line
    metrics:
    - type: cer
      value: 0.1295
      name: Character Error Rate
    - type: wer
      value: 0.4247
      name: Word Error Rate
---

# Handwriting Recognition

A complete handwriting recognition system using CNN-BiLSTM-CTC on the IAM dataset.

## Files

### 1. **analysis.ipynb** - Dataset Analysis
- Exploratory Data Analysis (EDA)
- 5 detailed charts saved to the `charts/` folder
- Runs locally or on Colab (no GPU needed)

### 2. **train_colab.ipynb** - Model Training (GPU)
- **Google Colab GPU compatible**
- Full training pipeline
- CNN-BiLSTM-CTC model (~9.1M parameters)
- Automatic model saving
- Download the trained model for deployment

## Quick Start

### Option 1: Analyze the Dataset (Local/Colab)
```bash
jupyter notebook analysis.ipynb
```
- No GPU needed
- Generates 5 EDA charts
- Fast (~2 minutes)

### Option 2: Train the Model (Google Colab GPU)

1. **Upload `train_colab.ipynb` to Google Colab**
2. **Change the runtime to GPU:**
   - Runtime → Change runtime type → GPU (T4 recommended)
3. **Run all cells**
4. **Download the trained model** (last cell)

|
| | **Training Time:** ~1-2 hours for 20 epochs on T4 GPU |
| |
|
| | ## π Charts Generated |
| |
|
| | From `analysis.ipynb`: |
| | 1. `charts/01_sample_images.png` - 10 sample handwritten texts |
| | 2. `charts/02_text_length_distribution.png` - Text statistics |
| | 3. `charts/03_image_dimensions.png` - Image analysis |
| | 4. `charts/04_character_frequency.png` - Character distribution |
| | 5. `charts/05_summary_statistics.png` - Summary table |
| |
|
## Model Details

**Architecture:**
- **CNN**: 7 convolutional blocks (feature extraction)
- **BiLSTM**: 2 layers, 256 hidden units (sequence modeling)
- **CTC loss**: Alignment-free training (no per-character segmentation needed)

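The three stages can be sketched in PyTorch. This is only a simplified illustration (two conv blocks rather than seven, so far smaller than the real ~9.1M-parameter model defined in `train_colab.ipynb`); the layer sizes here are assumptions, not the actual configuration:

```python
import torch
import torch.nn as nn

class TinyCRNN(nn.Module):
    """Simplified sketch of the CNN -> BiLSTM -> linear pipeline."""
    def __init__(self, num_chars, hidden=256):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # After two 2x2 pools, a height-32 input has height 8 -> 64*8 features per column
        self.rnn = nn.LSTM(64 * 8, hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        # +1 output class for the CTC blank symbol
        self.fc = nn.Linear(hidden * 2, num_chars + 1)

    def forward(self, x):              # x: (B, 1, 32, W)
        f = self.cnn(x)                # (B, 64, 8, W/4)
        f = f.permute(0, 3, 1, 2)      # (B, W/4, 64, 8) - width becomes the time axis
        f = f.flatten(2)               # (B, W/4, 512)
        out, _ = self.rnn(f)           # (B, W/4, 2*hidden)
        # For nn.CTCLoss you would log_softmax these logits and permute to (T, B, C)
        return self.fc(out)            # (B, W/4, num_chars + 1)

model = TinyCRNN(num_chars=79)
logits = model(torch.zeros(2, 1, 32, 128))  # 2 images, height 32, width 128
print(logits.shape)                         # torch.Size([2, 32, 80])
```

The key design point is that the CNN's width dimension becomes the BiLSTM's time axis, so one logit vector is emitted per horizontal slice of the image.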
**Dataset:** Teklia/IAM-line (Hugging Face)
- Train: 6,482 samples
- Validation: 976 samples
- Test: 2,915 samples
**Metrics:**
- **CER** (Character Error Rate)
- **WER** (Word Error Rate)

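Both metrics are edit distances normalized by reference length; the `jiwer` package from the requirements computes them as `jiwer.cer` and `jiwer.wer`. As a self-contained illustration, a minimal pure-Python version:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution
    return dp[-1]

def cer(ref, hyp):
    """Character Error Rate: character edits / reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)

def wer(ref, hyp):
    """Word Error Rate: word edits / reference word count."""
    return edit_distance(ref.split(), hyp.split()) / max(len(ref.split()), 1)

print(cer("hello world", "hella world"))  # 1 substitution over 11 characters
print(wer("hello world", "hella world"))  # 1 wrong word out of 2
```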
## Model Files

After training in Colab:
- `best_model.pth` - Trained model weights
- `training_history.png` - Loss/CER/WER plots
- `predictions.png` - Sample predictions
## Requirements

```
torch>=2.0.0
datasets>=2.14.0
pillow>=9.5.0
numpy>=1.24.0
matplotlib>=3.7.0
seaborn>=0.13.0
jupyter>=1.0.0
jiwer>=3.0.0
```
## Usage

### Load the Trained Model
```python
import torch

# Load the checkpoint; weights_only=False because it also stores the
# character mapper object, and map_location='cpu' works without a GPU
checkpoint = torch.load('best_model.pth', map_location='cpu', weights_only=False)
char_mapper = checkpoint['char_mapper']

# Recreate the model (copy the CRNN class from train_colab.ipynb)
from train_colab import CRNN
model = CRNN(num_chars=len(char_mapper.chars))
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

# Predict
# ... (preprocessing + inference)
```

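The elided inference step ends with CTC decoding of the model's per-time-step outputs. A minimal greedy decoder (the usual baseline: collapse repeated indices, then drop blanks); the `idx_to_char` mapping below is a hypothetical stand-in for the checkpoint's character mapper:

```python
def ctc_greedy_decode(indices, idx_to_char, blank=0):
    """Collapse repeats, then drop blanks (standard CTC greedy decoding)."""
    out = []
    prev = None
    for i in indices:
        if i != prev and i != blank:
            out.append(idx_to_char[i])
        prev = i
    return "".join(out)

# Per-time-step argmax indices from the model output (0 = CTC blank)
idx_to_char = {1: "c", 2: "a", 3: "t"}
print(ctc_greedy_decode([1, 1, 0, 2, 0, 3, 3], idx_to_char))  # cat
```

Note the order matters: repeats are collapsed before blanks are removed, which is what lets CTC represent genuine double letters via an intervening blank.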
## Notes

- **GPU strongly recommended** for training (use a Colab T4)
- Training on CPU is extremely slow (roughly 20x slower than a T4)
- Colab's free tier has a 12-hour session limit, which is enough for 20 epochs
- The model checkpoint includes the character mapper, so no extra files are needed for deployment

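On the last point, bundling the mapper with the weights keeps inference self-contained. A sketch of the idea; the dict-based `char_mapper` payload and the `epoch` key here are hypothetical stand-ins for whatever the notebook's mapper class actually stores:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for the trained CRNN
checkpoint = {
    "model_state_dict": model.state_dict(),
    "char_mapper": {"chars": "abcdefghijklmnopqrstuvwxyz"},  # hypothetical payload
    "epoch": 20,
}
torch.save(checkpoint, "best_model.pth")

# Later (or on another machine): restore both the weights and the mapping
restored = torch.load("best_model.pth", map_location="cpu", weights_only=False)
print(sorted(restored.keys()))  # ['char_mapper', 'epoch', 'model_state_dict']
```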
## Training Tips

1. **Start with fewer epochs** (5-10) to verify the pipeline
2. **Monitor CER/WER** - stop early if they plateau
3. **Increase epochs** (up to 50) if the metrics are still improving
4. **Save checkpoints** before Colab disconnects
5. **Download the model immediately** after training finishes
## License

Code: MIT. Dataset: IAM Handwriting Database (research use only).