🚀 Complete Guide: Deploying Khmer OCR to Hugging Face Spaces

Overview

This guide walks you through deploying your Khmer OCR model as an interactive web app on Hugging Face Spaces.

📁 Required Files

You need these 6 files:

  1. app.py - Main Gradio application
  2. requirements.txt - Python dependencies
  3. packages.txt - System dependencies (poppler for PDF)
  4. README.md - Documentation and model card
  5. .gitattributes - Git LFS configuration for large files
  6. crnn_khmer_official_natural_v4.pth - Your trained model (you need to provide this)
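
Before pushing anything, a quick check that all six files are present can save a failed build. The following is a minimal sketch; the `missing_files` helper is hypothetical (not part of app.py) and only uses the file names listed above.

```python
from pathlib import Path

# File names taken from the list above; the model name must match app.py.
REQUIRED_FILES = [
    "app.py",
    "requirements.txt",
    "packages.txt",
    "README.md",
    ".gitattributes",
    "crnn_khmer_official_natural_v4.pth",
]


def missing_files(space_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from space_dir."""
    root = Path(space_dir)
    return [name for name in required if not (root / name).is_file()]
```

Calling `missing_files(".")` from the cloned Space directory should return an empty list once everything is in place.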

🔧 Step-by-Step Deployment

Step 1: Create a Hugging Face Account

  1. Go to https://huggingface.co/
  2. Click "Sign Up" (top right)
  3. Create your account
  4. Verify your email

Step 2: Create a New Space

  1. Click your profile picture → "New Space"

  2. Fill in the form:

    • Space name: khmer-ocr (or your preferred name)
    • License: MIT
    • Select the Space SDK: Choose "Gradio"
    • Space hardware: CPU Basic (free) or upgrade if needed
    • Visibility: Public or Private
  3. Click "Create Space"

Step 3: Set Up Git LFS (Large File Storage)

Hugging Face rejects regular Git files larger than 10 MB, so the model file must be stored with Git LFS:

# Install Git LFS (one-time setup)
# On Ubuntu/Debian:
sudo apt-get install git-lfs

# On macOS:
brew install git-lfs

# On Windows:
# Download from https://git-lfs.github.com/

# Initialize Git LFS
git lfs install

Step 4: Clone Your Space Repository

# Clone the repository (replace YOUR_USERNAME and khmer-ocr)
git clone https://huggingface.co/spaces/YOUR_USERNAME/khmer-ocr
cd khmer-ocr

Step 5: Add Your Files

Copy all the files into the cloned directory:

# Copy the files
cp /path/to/app.py .
cp /path/to/requirements.txt .
cp /path/to/packages.txt .
cp /path/to/README.md .
cp /path/to/.gitattributes .

# Copy your model file
cp /path/to/crnn_khmer_official_natural_v4.pth .

IMPORTANT: Make sure your model file is named exactly crnn_khmer_official_natural_v4.pth or update the filename in app.py (line 24).

Step 6: Commit and Push

# Track the model file with Git LFS
git lfs track "*.pth"

# Add all files
git add .

# Commit
git commit -m "Initial commit: Khmer OCR model and app"

# Push to Hugging Face
# (when prompted for a password, use a Hugging Face access token with write permission)
git push
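
Before pushing, it can help to confirm that .gitattributes really routes the model file through Git LFS. The sketch below parses the file and does a simple glob match; the helper names are hypothetical, and real gitattributes matching has more rules (for example, patterns containing slashes) than `fnmatch` covers.

```python
from fnmatch import fnmatch


def lfs_patterns(gitattributes_text):
    """Collect path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(filename, gitattributes_text):
    """True if filename matches any LFS pattern (simple glob match only)."""
    return any(fnmatch(filename, p) for p in lfs_patterns(gitattributes_text))
```

For example, `is_lfs_tracked("crnn_khmer_official_natural_v4.pth", open(".gitattributes").read())` should be true after `git lfs track "*.pth"` has run.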

Step 7: Wait for Build

  1. Go to your Space URL: https://huggingface.co/spaces/YOUR_USERNAME/khmer-ocr
  2. The Space will automatically build (takes 2-5 minutes)
  3. You'll see a "Building..." status
  4. Once done, you'll see "Running" and the app will be live!
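
Instead of refreshing the page, the build status can also be polled from a script. This is a rough sketch: `wait_until_running` is a hypothetical helper that takes any callable returning the current stage string, and the set of failure stages is an assumption that may not be exhaustive.

```python
import time

# Stage names as reported by the Hub for a Space; the failure set here is
# an assumption and may not be exhaustive.
FAILED_STAGES = {"BUILD_ERROR", "RUNTIME_ERROR", "CONFIG_ERROR", "NO_APP_FILE"}


def wait_until_running(get_stage, timeout=600, interval=10, sleep=time.sleep):
    """Poll get_stage() until it returns "RUNNING", fails, or times out."""
    waited = 0
    while True:
        stage = get_stage()
        if stage == "RUNNING":
            return stage
        if stage in FAILED_STAGES:
            raise RuntimeError(f"Space stopped in stage {stage}")
        if waited >= timeout:
            raise TimeoutError(f"Space still in stage {stage} after {waited}s")
        sleep(interval)
        waited += interval
```

With the `huggingface_hub` package installed, `get_stage` could be `lambda: HfApi().get_space_runtime("YOUR_USERNAME/khmer-ocr").stage`.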

🎨 Customization Options

Change Model Name

If your model file has a different name, edit app.py:

# Line 24
MODEL_PATH = "your_model_name.pth"

Adjust Line Segmentation Parameters

In app.py, modify the LineSegmenter class parameters:

segmenter = LineSegmenter(
    min_line_height=20,      # Minimum line height in pixels
    max_line_height=100,     # Maximum line height in pixels
    min_line_width=100,      # Minimum line width in pixels
    min_aspect_ratio=2.0,    # Width/height ratio
    header_margin=0.12,      # Skip top 12% of page
    footer_margin=0.10,      # Skip bottom 10% of page
    side_margin=0.05         # Skip left/right 5% of page
)
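
To build intuition for what these parameters do, here is a hypothetical filter that applies them to one candidate bounding box. It is only an illustration under the assumption that boxes are `(x, y, width, height)` in pixels; the actual LineSegmenter in app.py may use the parameters differently.

```python
def keep_line_box(box, page_size, min_line_height=20, max_line_height=100,
                  min_line_width=100, min_aspect_ratio=2.0,
                  header_margin=0.12, footer_margin=0.10, side_margin=0.05):
    """Decide whether a candidate box (x, y, w, h) looks like a text line.

    Hypothetical filter: the real LineSegmenter may apply these
    parameters differently.
    """
    x, y, w, h = box
    page_w, page_h = page_size
    if not (min_line_height <= h <= max_line_height):
        return False
    if w < min_line_width or w / h < min_aspect_ratio:
        return False
    # Reject boxes that start in the header band or end in the footer band.
    if y < header_margin * page_h or y + h > (1 - footer_margin) * page_h:
        return False
    # Reject boxes that extend into the left or right margin.
    if x < side_margin * page_w or x + w > (1 - side_margin) * page_w:
        return False
    return True
```

Raising `header_margin` or `footer_margin` discards more of the page edges, which helps with running headers and page numbers but can drop real text on tightly cropped scans.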

Change App Title and Description

Edit the gr.Markdown() sections in app.py to customize the interface text.

Add Example Files

Create an examples folder with sample PDFs/images:

gr.Examples(
    examples=[
        ["examples/sample1.pdf"],
        ["examples/sample2.jpg"],
    ],
    inputs=file_input
)
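
Rather than hard-coding each path, the examples list could be built from whatever is in the folder. A small sketch, assuming the app accepts PDF, JPEG, and PNG inputs; `collect_examples` is a hypothetical helper.

```python
from pathlib import Path

EXAMPLE_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".png"}  # assumed input types


def collect_examples(folder="examples"):
    """Return [[path], ...] for supported files in folder, or [] if absent."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return [[str(p)] for p in sorted(root.iterdir())
            if p.suffix.lower() in EXAMPLE_SUFFIXES]
```

The result has the same shape as the `examples=` argument above, so it can be passed to `gr.Examples` directly.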

📊 Hardware Options

Free Tier (CPU Basic)

  • RAM: 16GB
  • CPU: 2 vCPU
  • Storage: 50GB
  • Cost: Free
  • Best for: Testing, low-traffic apps

Upgraded (CPU Upgrade)

  • RAM: 32GB
  • CPU: 8 vCPU
  • Storage: 50GB
  • Cost: ~$0.03/hour
  • Best for: Production, higher traffic

GPU Options

  • T4 GPU: ~$0.60/hour (faster inference)
  • A10G GPU: ~$3.15/hour (much faster)

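To upgrade: Space Settings → Hardware → Select plan. Prices depend on the tier and change over time, so confirm the current rate in the hardware selector before upgrading.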

πŸ› Troubleshooting

Problem: "Model file not found"

Solution: Make sure the model file is in the root directory and the filename matches exactly in app.py.
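
A clearer error message makes this problem easier to diagnose from the logs. The sketch below is one possible approach, not the code in app.py: `resolve_model_path` is a hypothetical helper that looks for the model next to the app and reports which .pth files it did find.

```python
from pathlib import Path

MODEL_FILENAME = "crnn_khmer_official_natural_v4.pth"


def resolve_model_path(app_dir, filename=MODEL_FILENAME):
    """Return the model path inside app_dir, or raise with a helpful message."""
    path = Path(app_dir) / filename
    if not path.is_file():
        # List the .pth files that do exist so a misnamed file is easy to spot.
        found = sorted(p.name for p in Path(app_dir).glob("*.pth"))
        raise FileNotFoundError(
            f"{filename} not found in {app_dir}; .pth files present: {found or 'none'}"
        )
    return path
```

In app.py this could be used as `MODEL_PATH = resolve_model_path(Path(__file__).parent)`.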

Problem: "Out of memory"

Solution:

  1. Reduce batch processing
  2. Upgrade to CPU Upgrade or GPU hardware
  3. Process PDFs page-by-page instead of all at once
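
Page-by-page processing can be sketched with pdf2image's `first_page` and `last_page` arguments, so only a few rendered pages are held in memory at a time. The helper names here are hypothetical, and app.py may already structure its PDF loading differently.

```python
def page_ranges(total_pages, batch_size=1):
    """Yield 1-based, inclusive (first, last) page ranges covering the PDF."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(1, total_pages + 1, batch_size):
        yield first, min(first + batch_size - 1, total_pages)


def iter_pdf_pages(pdf_path, dpi=300, batch_size=1):
    """Yield PIL images a few pages at a time instead of rendering all at once."""
    # Imported lazily so page_ranges() stays usable without pdf2image installed.
    from pdf2image import convert_from_path, pdfinfo_from_path

    total = pdfinfo_from_path(pdf_path)["Pages"]
    for first, last in page_ranges(total, batch_size):
        yield from convert_from_path(
            pdf_path, dpi=dpi, first_page=first, last_page=last
        )
```

Running OCR inside a `for image in iter_pdf_pages(path):` loop lets each page be released before the next one is rendered.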

Problem: "pdf2image not working"

Solution: Make sure packages.txt includes poppler-utils
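
Both halves of this can be checked from Python: whether Poppler is actually installed in the running environment, and whether packages.txt asks for it. A minimal sketch with hypothetical helper names:

```python
import shutil
from pathlib import Path


def poppler_available():
    """True if Poppler's pdftoppm binary, which pdf2image calls, is on PATH."""
    return shutil.which("pdftoppm") is not None


def packages_txt_lists(package, packages_txt="packages.txt"):
    """True if package appears on its own line in packages.txt."""
    path = Path(packages_txt)
    if not path.is_file():
        return False
    lines = (line.strip() for line in path.read_text().splitlines())
    return package in lines
```

Logging `poppler_available()` at startup makes a missing system package visible in the Space logs before the first PDF is uploaded.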

Problem: "Build failed"

Solution: Check the Space logs:

  1. Go to your Space
  2. Click "Logs" tab
  3. Look for error messages
  4. Fix the issue and push again

Problem: "App is slow"

Solution:

  1. Upgrade hardware to GPU
  2. Reduce image DPI in app.py (currently 300)
  3. Implement caching for repeated requests
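
Caching for repeated requests could look roughly like the following, keyed on a hash of the uploaded bytes plus the DPI. Here `run_ocr` stands in for whatever function app.py uses to process a file; its name and signature are assumptions.

```python
import hashlib

# In-memory cache; unbounded, so a long-running app may want a size limit.
_ocr_cache = {}


def cached_ocr(file_bytes, run_ocr, dpi=300):
    """Run run_ocr(file_bytes, dpi) once per unique (content, dpi) pair."""
    key = (hashlib.sha256(file_bytes).hexdigest(), dpi)
    if key not in _ocr_cache:
        _ocr_cache[key] = run_ocr(file_bytes, dpi)
    return _ocr_cache[key]
```

Because the key includes the DPI, lowering the DPI produces a fresh result rather than returning text recognized at the old resolution.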

📈 Monitoring

View Usage Stats

  1. Go to your Space
  2. Click "Analytics" tab
  3. See: visitors, requests, errors

View Logs

  1. Click "Logs" tab
  2. See real-time app logs
  3. Debug issues

🔐 Privacy & Security

For Private Models

If you want to keep your model private:

  1. Set Space visibility to "Private"
  2. Only you can access it
  3. Share with specific users: Settings → Collaborators

API Access

Your Space automatically gets an API endpoint:

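from gradio_client import Client, handle_file

client = Client("YOUR_USERNAME/khmer-ocr")

# Endpoint and argument names depend on app.py; list them with client.view_api()
result = client.predict(
    handle_file("path/to/document.pdf"),  # uploads the local file to the Space
    api_name="/predict",  # assumed endpoint name; adjust to match view_api()
)
print(result)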

💡 Pro Tips

  1. Test Locally First

    python app.py
    # Access at http://localhost:7860
    
  2. Use Git LFS for All Large Files

    git lfs track "*.pth"
    git lfs track "*.bin"
    
  3. Add .gitignore

    __pycache__/
    *.pyc
    .DS_Store
    .vscode/
    *.log
    
  4. Version Control

    • Tag releases: git tag v1.0.0
    • Use branches for experiments
    • Keep main branch stable
  5. Documentation

    • Update README.md with examples
    • Add model performance metrics
    • Include usage examples

🎓 Next Steps

Improve the Model

  1. Collect More Data: Add diverse documents
  2. Fine-tune: Continue training on specific domains
  3. Data Augmentation: Add more variations

Enhance the App

  1. Add Language Selection: Support multiple languages
  2. Batch Processing: Process multiple files at once
  3. Export Options: Save as TXT, JSON, DOCX
  4. Confidence Scores: Show OCR confidence per line
  5. Interactive Editing: Let users correct mistakes
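
As a starting point for the export and confidence ideas, the sketch below assumes OCR results are available as a list of dicts with `text` and `confidence` keys. That shape is an assumption for illustration, not what app.py currently returns.

```python
import json


def export_txt(lines):
    """Join recognized lines into plain text, one line per row."""
    return "\n".join(line["text"] for line in lines)


def export_json(lines):
    """Serialize lines, keeping Khmer characters readable rather than escaped."""
    return json.dumps(lines, ensure_ascii=False, indent=2)
```

Passing `ensure_ascii=False` matters for Khmer output: without it, every character is written as a `\uXXXX` escape, which is valid JSON but hard to read.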

Share Your Work

  1. Write a blog post on Hugging Face
  2. Share on social media
  3. Add to model hub with proper documentation
  4. Create tutorials and examples

📚 Resources

❓ Need Help?


Good luck with your deployment! 🚀