	# 📦 Hugging Face Deployment - File Checklist

	## ✅ Files Created for Deployment

	### Core Application Files

	- ✅ app.py - Gradio interface with OCR processing
	- ✅ requirements.txt - Python dependencies (Gradio + eDOCr2)
	- ✅ packages.txt - System dependencies (Tesseract, Poppler)
	- ✅ README.md - Space description with YAML frontmatter
	- ✅ .gitattributes - Git LFS configuration for model files

	### Documentation and Helper Scripts

	- ✅ DEPLOYMENT.md - Complete deployment guide
	- ✅ run_local.bat - Windows quick start script
	- ✅ run_local.sh - Linux/Mac quick start script

	### Required Folders

	- ✅ edocr2/ - Main package (already exists)
	- ✅ edocr2/tools/ - OCR pipelines
	- ✅ edocr2/keras_ocr/ - OCR models
	- ⚠️ edocr2/models/ - Model files (MUST DOWNLOAD)
	- ✅ tests/test_samples/ - Example drawings (optional)
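
The files and folders listed above can be checked with a short script before deploying. This is a minimal sketch: the path lists are copied from this checklist, and the helper name `missing_paths` is made up for illustration.

```python
from pathlib import Path

# Paths copied from the checklist; adjust them if your layout differs.
REQUIRED_FILES = ["app.py", "requirements.txt", "packages.txt", "README.md", ".gitattributes"]
REQUIRED_DIRS = ["edocr2", "edocr2/tools", "edocr2/keras_ocr", "edocr2/models"]


def missing_paths(root):
    """Return the required files and folders that are absent under root."""
    root = Path(root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    gaps = missing_paths(".")
    print("All required paths present" if not gaps else "Missing: " + ", ".join(gaps))
```

Run it from the project root. An empty result means every required path exists, but it does not confirm that the model files inside `edocr2/models/` are complete.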

	## 🔴 IMPORTANT: Download Model Files

	Before deploying, download these four files and place them in `edocr2/models/`:

	1. recognizer_gdts.keras (67.2 MB)
	2. recognizer_gdts.txt (85 bytes)
	3. recognizer_dimensions_2.keras (67.2 MB)
	4. recognizer_dimensions_2.txt (42 bytes)

	Download from: https://github.com/javvi51/edocr2/releases/tag/v1.0.0

	## 📋 Pre-Deployment Checklist

	### Local Testing

	- [ ] Models downloaded and placed in `edocr2/models/`
	- [ ] Run `python app.py` locally
	- [ ] Test with sample images
	- [ ] Verify all outputs (image, JSON, ZIP)

	### Hugging Face Setup

	- [ ] Hugging Face account created
	- [ ] Git LFS installed
	- [ ] New Space created on Hugging Face

	### File Verification

	- [ ] All files listed above are present in the project folder
	- [ ] Model files in correct location
	- [ ] `.gitattributes` configured for LFS
	- [ ] `README.md` has YAML frontmatter
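
The frontmatter check can be automated. The sketch below is a deliberately naive parser for top-level `key: value` lines, not a full YAML parser; `sdk` and `app_file` are standard Space configuration keys, while the sample values and the function name `parse_frontmatter` are assumptions for illustration.

```python
def parse_frontmatter(readme_text):
    """Return top-level `key: value` pairs from a leading --- block, or None."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # file does not start with frontmatter
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        # Skip nested, indented, or commented lines; keep simple pairs only.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip("'\"")
    return None  # opening --- without a closing ---


# Illustrative README content; your real values will differ.
sample = "---\ntitle: eDOCr2\nsdk: gradio\napp_file: app.py\n---\n# eDOCr2\n"
meta = parse_frontmatter(sample)
print(meta is not None and meta.get("sdk") == "gradio")
```

To check the real file, pass `open("README.md", encoding="utf-8").read()` instead of `sample`.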

	## 🚀 Deployment Steps

	### 1. Download Models

	Windows PowerShell:
	```powershell
	cd edocr2-main
	New-Item -ItemType Directory -Force -Path edocr2\models
	cd edocr2\models

	Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.keras" -OutFile "recognizer_gdts.keras"
	Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.txt" -OutFile "recognizer_gdts.txt"
	Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.keras" -OutFile "recognizer_dimensions_2.keras"
	Invoke-WebRequest -Uri "https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.txt" -OutFile "recognizer_dimensions_2.txt"

	cd ..\..
	```

	Linux/Mac:
	```bash
	cd edocr2-main
	mkdir -p edocr2/models
	cd edocr2/models

	wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.keras
	wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_gdts.txt
	wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.keras
	wget https://github.com/javvi51/edocr2/releases/download/v1.0.0/recognizer_dimensions_2.txt

	cd ../..
	```
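
If neither `wget` nor PowerShell is available, the same downloads can be done from Python. This is a sketch using only the standard library; the release URL and file names come from the commands above, while `download_models` and its `fetch` parameter are illustrative names, not part of eDOCr2.

```python
from pathlib import Path
from urllib.request import urlretrieve

BASE_URL = "https://github.com/javvi51/edocr2/releases/download/v1.0.0"
MODEL_FILES = [
    "recognizer_gdts.keras",
    "recognizer_gdts.txt",
    "recognizer_dimensions_2.keras",
    "recognizer_dimensions_2.txt",
]


def model_url(name):
    """Build the release download URL for one model file."""
    return f"{BASE_URL}/{name}"


def download_models(target_dir="edocr2/models", fetch=urlretrieve):
    """Download model files that are not already present; return the names fetched."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in MODEL_FILES:
        dest = target / name
        if dest.exists() and dest.stat().st_size > 0:
            continue  # keep an existing non-empty file
        fetch(model_url(name), str(dest))
        fetched.append(name)
    return fetched
```

Call `download_models()` from the `edocr2-main` folder. Files that already exist are skipped, so the call is safe to repeat after an interrupted run once any partial file has been deleted.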

	### 2. Test Locally (Optional but Recommended)

	Windows (Command Prompt; in PowerShell, run `.\run_local.bat`):
	```bat
	run_local.bat
	```

	Linux/Mac:
	```bash
	chmod +x run_local.sh
	./run_local.sh
	```

	Open: http://localhost:7860
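
If the page does not load, it can help to confirm that the app is actually listening. The sketch below only tests whether a TCP connection to port 7860 succeeds; it does not verify that the Gradio interface itself works, and `is_port_open` is an illustrative helper name.

```python
import socket


def is_port_open(host="localhost", port=7860, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False
```

Calling `is_port_open()` while `run_local` is active should return `True`; a `False` result suggests the app exited or is bound to a different port.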

	### 3. Create Hugging Face Space

	1. Go to https://huggingface.co/spaces
	2. Click "Create new Space"
	3. Settings:
	   - Name: `edocr2` (or your choice)
	   - License: MIT
	   - SDK: Gradio
	   - Hardware: CPU Basic (free)

	### 4. Clone Space Repository

	```bash
	git clone https://huggingface.co/spaces/YOUR_USERNAME/edocr2
	cd edocr2
	```

	### 5. Copy Files

	Windows (Command Prompt):
	```bat
	xcopy /E /I /Y C:\path\to\edocr2-main\* .
	```

	Linux/Mac:
	```bash
	cp -r /path/to/edocr2-main/* /path/to/edocr2-main/.gitattributes .  # the * glob skips dotfiles
	```

	### 6. Setup Git LFS

	```bash
	git lfs install
	git lfs track "*.keras"
	git add .gitattributes
	```
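
`git lfs track "*.keras"` records the pattern in `.gitattributes` as a line containing `filter=lfs`. The sketch below lists the patterns tracked that way, which is a quick way to confirm the setup before committing; the function name is illustrative.

```python
def lfs_tracked_patterns(gitattributes_text):
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


# The line that `git lfs track "*.keras"` writes.
example = "*.keras filter=lfs diff=lfs merge=lfs -text\n"
print(lfs_tracked_patterns(example))
```

For the real file, pass `open(".gitattributes", encoding="utf-8").read()` and check that `*.keras` is in the result.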

	### 7. Commit and Push

	```bash
	git add .
	git commit -m "Initial deployment of eDOCr2"
	git push origin main
	```

	Note: Upload may take 5-10 minutes for large model files.

	### 8. Wait for Build

	- Go to your Space URL
	- Wait 5-10 minutes for build
	- Check "Logs" tab for errors
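
Waiting for the build can also be scripted as a polling loop. The sketch below is generic: `get_stage` is any callable you supply that returns the current stage, and the stage names are assumptions based on the values reported by the `huggingface_hub` library.

```python
import time

# Assumed stage names, following huggingface_hub's SpaceStage values.
FAILED_STAGES = {"BUILD_ERROR", "RUNTIME_ERROR", "CONFIG_ERROR", "NO_APP_FILE"}


def wait_for_space(get_stage, timeout=900, interval=15, sleep=time.sleep):
    """Poll get_stage() until the Space is RUNNING, has failed, or timeout elapses."""
    waited = 0
    while True:
        stage = get_stage()
        stage = getattr(stage, "value", stage)  # accept enum members or plain strings
        if stage == "RUNNING" or stage in FAILED_STAGES:
            return stage
        if waited >= timeout:
            return "TIMEOUT"
        sleep(interval)
        waited += interval
```

With `huggingface_hub` installed, `get_stage` could be `lambda: HfApi().get_space_runtime("YOUR_USERNAME/edocr2").stage`. Any result other than `RUNNING` means the "Logs" tab is the next place to look.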

	## ✅ Verification

	Once deployed:

	- [ ] Space shows Gradio interface
	- [ ] Models load successfully (check logs)
	- [ ] Can upload images
	- [ ] Processing works
	- [ ] Results display correctly
	- [ ] Download ZIP works

	## 🎯 Your Space URL

	After deployment, your Space will be at:

	```
	https://huggingface.co/spaces/YOUR_USERNAME/edocr2
	```

	## 📊 Expected Performance

	### CPU Basic (Free)
	- Processing time: roughly 20-30 seconds per image
	- Memory: 16 GB RAM (2 vCPU)
	- Cost: free

	### T4 GPU (Paid)
	- Processing time: roughly 5-10 seconds per image
	- Memory: 16 GB GPU memory
	- Cost: about $0.60/hour (check the Space hardware settings for current pricing)
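
To compare the two tiers, the figures above can be turned into throughput and cost per image. This is back-of-the-envelope arithmetic, assuming the Space processes images one after another and is busy for the whole billed hour; idle time is billed too, so real costs per image will be higher.

```python
def cost_per_image(hourly_rate_usd, seconds_per_image):
    """Cost of one image when hardware is billed by wall-clock time."""
    return hourly_rate_usd * seconds_per_image / 3600


def images_per_hour(seconds_per_image):
    """Throughput when images are processed sequentially."""
    return 3600 / seconds_per_image


# T4 figures from the list above: 5-10 seconds per image at $0.60/hour.
for seconds in (5, 10):
    print(f"{seconds} s/image: {images_per_hour(seconds):.0f} images/hour, "
          f"${cost_per_image(0.60, seconds):.4f} per image")
```

At these rates the T4 handles 360 to 720 images per hour, which works out to well under one cent per image.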

	## 🐛 Common Issues

	### "Models not found"
	- Ensure models are in `edocr2/models/`
	- Check Git LFS tracked the files
	- Verify file names are correct
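
One cause worth ruling out is a `.keras` file that is only a Git LFS pointer, which happens when a repository is cloned or copied without the LFS content. A pointer is a small text file that begins with the LFS spec line, so it can be detected by reading the first bytes. The helper name below is illustrative.

```python
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is a Git LFS pointer stub rather than real content."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX
```

For example, `is_lfs_pointer("edocr2/models/recognizer_gdts.keras")` returning `True` means the actual model was never fetched; running `git lfs pull` in that repository usually resolves it.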

	### "Out of memory"
	- Upgrade to GPU hardware
	- Or reduce `max_img_size` in app.py
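
How `app.py` applies `max_img_size` is not shown in this checklist. The sketch below illustrates one common interpretation, capping the longer image side while keeping the aspect ratio; `fit_within` is a hypothetical helper, not a function from eDOCr2.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, leave unchanged
    scale = max_side / longest
    # Round to whole pixels and never return a zero dimension.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 2000, 1000))
```

Under this interpretation, halving the longer side cuts the pixel count to about a quarter, which is why a smaller limit reduces memory use so quickly.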

	### "Build failed"
	- Check logs for specific error
	- Verify all dependencies in requirements.txt
	- Ensure packages.txt has system deps

	## 📚 Resources

	- Deployment Guide: See `DEPLOYMENT.md`
	- Hugging Face Docs: https://huggingface.co/docs/hub/spaces
	- Gradio Docs: https://gradio.app/docs
	- Original Repo: https://github.com/javvi51/edocr2

	## 🎉 Success!

	Once deployed, share your Space:

	```
	🔗 https://huggingface.co/spaces/YOUR_USERNAME/edocr2
	```

	---

	Questions? Check `DEPLOYMENT.md` for detailed troubleshooting.