📦 Hugging Face Deployment - File Checklist

✅ Files Created for Deployment

Core Application Files

  • ✅ app.py - Gradio interface with OCR processing
  • ✅ requirements.txt - Python dependencies (Gradio + eDOCr2)
  • ✅ packages.txt - System dependencies (Tesseract, Poppler)
  • ✅ README.md - Space description with YAML frontmatter
  • ✅ .gitattributes - Git LFS configuration for model files

Documentation

  • ✅ DEPLOYMENT.md - Complete deployment guide
  • ✅ run_local.bat - Windows quick start script
  • ✅ run_local.sh - Linux/Mac quick start script

Required Folders

  • ✅ edocr2/ - Main package (already exists)

    • ✅ edocr2/tools/ - OCR pipelines
    • ✅ edocr2/keras_ocr/ - OCR models
    • ⚠️ edocr2/models/ - Model files (MUST DOWNLOAD)
  • ✅ tests/test_samples/ - Example drawings (optional)

🔴 IMPORTANT: Download Model Files

Before deploying, download these 4 files and place them in edocr2/models/:

  1. recognizer_gdts.keras (67.2 MB)
  2. recognizer_gdts.txt (85 bytes)
  3. recognizer_dimensions_2.keras (67.2 MB)
  4. recognizer_dimensions_2.txt (42 bytes)

Download from: https://github.com/javvi51/edocr2/releases/tag/v1.0.0
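
As a quick sanity check before deploying, a short script can confirm that all four files are in place. The sketch below is illustrative and not part of eDOCr2: the `check_models` helper and the minimum-size thresholds are assumptions, chosen to sit just under the sizes listed above.

```python
from pathlib import Path

# Minimum plausible sizes in bytes. These thresholds are assumptions set
# below the sizes listed above (67.2 MB for each .keras file).
MIN_SIZES = {
    "recognizer_gdts.keras": 60_000_000,
    "recognizer_gdts.txt": 1,
    "recognizer_dimensions_2.keras": 60_000_000,
    "recognizer_dimensions_2.txt": 1,
}


def check_models(models_dir="edocr2/models", min_sizes=MIN_SIZES):
    """Return a list of problems; an empty list means every file passed."""
    problems = []
    for name, min_size in min_sizes.items():
        path = Path(models_dir) / name
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif path.stat().st_size < min_size:
            problems.append(f"too small: {name} ({path.stat().st_size} bytes)")
    return problems


if __name__ == "__main__":
    issues = check_models()
    print("\n".join(issues) if issues else "All model files look fine.")
```

Run it from the project root. A "too small" report for a .keras file usually points to an interrupted download.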

📋 Pre-Deployment Checklist

Local Testing

  • Models downloaded and placed in edocr2/models/
  • Run python app.py locally
  • Test with sample images
  • Verify all outputs (image, JSON, ZIP)

Hugging Face Setup

  • Hugging Face account created
  • Git LFS installed
  • New Space created on Hugging Face

File Verification

  • All files present in folder
  • Model files in correct location
  • .gitattributes configured for LFS
  • README.md has YAML frontmatter
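
The frontmatter item can be checked automatically. This is a minimal sketch, assuming the Space is configured through the `sdk` and `app_file` keys and that app.py is the entry point. `read_frontmatter` is a hypothetical helper that only understands flat `key: value` lines, not full YAML.

```python
from pathlib import Path


def read_frontmatter(text):
    """Return top-level key/value pairs of a leading '---' block, or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # the file does not start with a frontmatter block
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter found
        # Skip blank, indented, comment, and list lines; keep flat pairs only.
        if line[:1] in (" ", "\t", "#", "-") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return None  # the block was never closed


if __name__ == "__main__":
    readme = Path("README.md")
    fields = None
    if readme.is_file():
        fields = read_frontmatter(readme.read_text(encoding="utf-8"))
    if fields is None:
        print("README.md is missing or has no YAML frontmatter")
    else:
        print("sdk:", fields.get("sdk"), "| app_file:", fields.get("app_file"))
```

For a Gradio Space, the printed `sdk` value would be expected to read `gradio`.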

🚀 Deployment Steps

1. Download Models

Windows PowerShell:

cd edocr2-main
New-Item -ItemType Directory -Force -Path edocr2\models
cd edocr2\models

Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.keras" -OutFile "recognizer_gdts.keras"
Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.txt" -OutFile "recognizer_gdts.txt"
Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.keras" -OutFile "recognizer_dimensions_2.keras"
Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.txt" -OutFile "recognizer_dimensions_2.txt"

cd ..\..

Linux/Mac:

cd edocr2-main
mkdir -p edocr2/models
cd edocr2/models

wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.keras
wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.txt
wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.keras
wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.txt

cd ../..
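
Where neither PowerShell nor wget is convenient, the same four downloads can be scripted with Python's standard library. This is a sketch only: `model_urls` and `download_models` are illustrative names, and the URLs are the release links used in the commands above.

```python
from pathlib import Path
from urllib.request import urlretrieve

BASE_URL = "https://github.com/javvi51/edocr2/releases/download/v1.0.0"
MODEL_FILES = [
    "recognizer_gdts.keras",
    "recognizer_gdts.txt",
    "recognizer_dimensions_2.keras",
    "recognizer_dimensions_2.txt",
]


def model_urls(base_url=BASE_URL):
    """Map each model file name to its release download URL."""
    return {name: f"{base_url}/{name}" for name in MODEL_FILES}


def download_models(models_dir="edocr2/models", skip_existing=True):
    """Download missing model files and return the names that were fetched."""
    target = Path(models_dir)
    target.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name, url in model_urls().items():
        dest = target / name
        if skip_existing and dest.exists():
            continue  # already downloaded on an earlier run
        urlretrieve(url, str(dest))
        fetched.append(name)
    return fetched
```

Calling `download_models()` from the edocr2-main directory places the files in edocr2/models/, and running it again skips files that already exist.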

2. Test Locally (Optional but Recommended)

Windows:

run_local.bat

Linux/Mac:

chmod +x run_local.sh
./run_local.sh

Open: http://localhost:7860

3. Create Hugging Face Space

  1. Go to https://huggingface.co/spaces
  2. Click "Create new Space"
  3. Settings:
    • Name: edocr2 (or your choice)
    • License: MIT
    • SDK: Gradio
    • Hardware: CPU Basic (free)

4. Clone Space Repository

git clone https://huggingface.co/spaces/YOUR_USERNAME/edocr2
cd edocr2

5. Copy Files

Windows:

xcopy /E /I C:\path\to\edocr2-main\* .

Linux/Mac:

cp -r /path/to/edocr2-main/* .
cp /path/to/edocr2-main/.gitattributes .  # the * glob skips dotfiles

6. Setup Git LFS

git lfs install
git lfs track "*.keras"
git add .gitattributes
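
Before committing, it can be worth confirming that the tracking rule actually landed in .gitattributes. The sketch below uses hypothetical helpers `lfs_patterns` and `is_lfs_tracked`. It approximates Git's pattern matching with `fnmatchcase`, which is enough for a simple pattern such as `*.keras` but does not cover every gitattributes rule.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def lfs_patterns(gitattributes_text):
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("#"):
            continue  # blank line, comment, or pattern without attributes
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(filename, gitattributes_text):
    """Approximate check that a file name matches an LFS-tracked pattern."""
    return any(fnmatchcase(filename, p) for p in lfs_patterns(gitattributes_text))


if __name__ == "__main__":
    attrs = Path(".gitattributes")
    text = attrs.read_text(encoding="utf-8") if attrs.is_file() else ""
    print("*.keras tracked by LFS:", is_lfs_tracked("recognizer_gdts.keras", text))
```

`git lfs track "*.keras"` writes a line of the form `*.keras filter=lfs diff=lfs merge=lfs -text`, which is what the check looks for.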

7. Commit and Push

git add .
git commit -m "Initial deployment of eDOCr2"
git push origin main

Note: Upload may take 5-10 minutes for large model files.

8. Wait for Build

  • Go to your Space URL
  • Wait 5-10 minutes for the build to finish
  • Check "Logs" tab for errors

✅ Verification

Once deployed:

  • Space shows Gradio interface
  • Models load successfully (check logs)
  • Can upload images
  • Processing works
  • Results display correctly
  • Download ZIP works

🎯 Your Space URL

After deployment, your Space will be at:

https://huggingface.co/spaces/YOUR_USERNAME/edocr2

📊 Expected Performance

CPU Basic (Free)

  • Processing time: 20-30 seconds per image
  • Memory: 16 GB RAM (2 vCPU)
  • Cost: FREE

T4 GPU (Paid)

  • Processing time: 5-10 seconds per image
  • Memory: 16 GB GPU memory (VRAM)
  • Cost: $0.60/hour
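
To compare the two tiers, the figures above can be turned into throughput and per-image cost. This is back-of-the-envelope arithmetic, assuming images are processed one at a time. Paid hardware is typically billed for the time the Space is running, so idle time would add to the real cost.

```python
def images_per_hour(seconds_per_image):
    """Throughput when images are processed sequentially."""
    return 3600 / seconds_per_image


def cost_per_image(seconds_per_image, dollars_per_hour):
    """Compute cost attributable to a single image, ignoring idle time."""
    return seconds_per_image * dollars_per_hour / 3600


if __name__ == "__main__":
    # Ranges taken from the figures listed above.
    for label, (fast, slow), rate in [
        ("CPU Basic", (20, 30), 0.00),
        ("T4 GPU", (5, 10), 0.60),
    ]:
        print(
            f"{label}: {images_per_hour(slow):.0f}-{images_per_hour(fast):.0f} images/hour, "
            f"${cost_per_image(fast, rate):.4f}-${cost_per_image(slow, rate):.4f} per image"
        )
```

With these numbers, the T4 tier works out to well under a cent per image while processing.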

🐛 Common Issues

"Models not found"

  • Ensure models are in edocr2/models/
  • Check Git LFS tracked the files
  • Verify file names are correct
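
A model file can have the right name and location and still fail to load if a clone made without Git LFS left a small pointer file in its place. The sketch below, built around a hypothetical `is_lfs_pointer` helper, looks for the version line that a Git LFS pointer file starts with.

```python
from pathlib import Path

# A Git LFS pointer is a small text file that begins with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is an LFS pointer rather than real content."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


if __name__ == "__main__":
    # Yields nothing if the directory does not exist.
    for model in sorted(Path("edocr2/models").glob("*.keras")):
        status = "LFS pointer (run: git lfs pull)" if is_lfs_pointer(model) else "ok"
        print(f"{model.name}: {status}")
```

If a file is reported as a pointer, `git lfs pull` in that clone should replace it with the actual model.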

"Out of memory"

  • Upgrade to GPU hardware
  • Or reduce max_img_size in app.py
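
Lowering `max_img_size` helps because the memory held by an image grows with its pixel count, so halving both sides cuts the pixels to a quarter. How app.py applies the setting is not shown here. The sketch below assumes it caps the longest side, and `fit_within` is a hypothetical helper illustrating that rule.

```python
def fit_within(width, height, max_img_size):
    """Scale (width, height) down so the longest side is at most max_img_size.

    The aspect ratio is preserved and images that already fit are unchanged.
    """
    longest = max(width, height)
    if longest <= max_img_size:
        return width, height
    # Integer arithmetic avoids floating-point rounding surprises.
    new_width = max(1, width * max_img_size // longest)
    new_height = max(1, height * max_img_size // longest)
    return new_width, new_height


if __name__ == "__main__":
    print(fit_within(4000, 2000, 2048))
    print(fit_within(800, 600, 2048))
```

With Pillow, `Image.thumbnail((max_img_size, max_img_size))` applies a similar aspect-preserving downscale in place.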

"Build failed"

  • Check logs for specific error
  • Verify all dependencies in requirements.txt
  • Ensure packages.txt has system deps

📚 Resources

🎉 Success!

Once deployed, share your Space:

🔗 https://huggingface.co/spaces/YOUR_USERNAME/edocr2

Questions? Check DEPLOYMENT.md for detailed troubleshooting.