Khmer OCR Recognition Model
🇰🇭 High-accuracy OCR model for Khmer text recognition using the PaddleOCR framework
Model Overview
This CRNN-based OCR model is specifically trained for Khmer (Cambodian) text recognition, achieving 98.45% accuracy on validation data. The model is optimized for recognizing short text segments (3-5 words) commonly found in documents, signs, and printed materials.
Model Architecture
- Framework: PaddleOCR 2.7+
- Algorithm: CRNN (Convolutional Recurrent Neural Network)
- Backbone: ResNet34
- Neck: SequenceEncoder with RNN (hidden_size: 256)
- Head: CTCHead with CTC Loss
- Input Shape: [3, 32, 320] (channels, height, width)
- Max Text Length: 25 characters
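To illustrate what a CTC head implies at inference time, here is a minimal sketch of greedy CTC decoding: take the most likely class at each time step, collapse consecutive repeats, then drop the blank symbol. It assumes index 0 is the blank and uses a toy three-character alphabet, not the model's 188-character dictionary; the function name is hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is reserved for the CTC blank


def ctc_greedy_decode(step_ids, charset):
    """Collapse repeated indices, then drop blanks.

    charset[i - 1] is the character for class index i (index 0 is the blank).
    """
    collapsed = [idx for idx, _ in groupby(step_ids)]
    return "".join(charset[idx - 1] for idx in collapsed if idx != BLANK)


# Toy alphabet (hypothetical), standing in for the real dictionary
toy_charset = ["a", "b", "c"]

# Per-step argmax: a a blank a b b blank c
print(ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0, 3], toy_charset))  # aabc
```

The blank between the two runs of `a` is what lets a doubled character survive the collapse step.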
Supported Characters
The model recognizes 188 characters including:
- Khmer Consonants: ក ខ គ ឃ ង ច ឆ ជ ឈ ញ ដ ឋ ឌ ឍ ណ ត ថ ទ ធ ន ប ផ ព ភ ម យ រ ល វ ស ហ ឡ អ
- Khmer Vowels: ា ិ ី ឹ ឺ ុ ូ ួ ើ ឿ ៀ េ ែ ៃ ោ ៅ ំ ះ ៈ
- Khmer Numerals: ០ ១ ២ ៣ ៤ ៥ ៦ ៧ ៨ ៩
- Latin Characters: A-Z, a-z, 0-9
- Punctuation: . , ! ? - ( ) [ ] « » ™ ® etc.
- Khmer Symbols: Khmer punctuation marks and signs (the exact set is listed in khmer_char_dict.txt)
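Because the model can only emit characters from its dictionary, it can help to check whether the text you expect is covered. The sketch below assumes the dictionary file holds one character per line and that whitespace is handled separately from the dictionary; both function names are hypothetical.

```python
from pathlib import Path


def load_charset(dict_path):
    """Read a character dictionary with one entry per line."""
    lines = Path(dict_path).read_text(encoding="utf-8").splitlines()
    return {line for line in lines if line}


def unsupported_chars(text, charset):
    """Return characters in text missing from charset, in first-seen order."""
    missing = []
    for ch in text:
        # Assumption: whitespace is not part of the dictionary check
        if ch.isspace() or ch in charset or ch in missing:
            continue
        missing.append(ch)
    return missing


# Toy set for illustration; in practice: load_charset("khmer_char_dict.txt")
charset = {"ក", "ខ", "a"}
print(unsupported_chars("កខ abz", charset))  # ['b', 'z']
```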
Quick Start
Installation
pip install paddlepaddle paddleocr opencv-python
Basic Usage
from paddleocr import PaddleOCR

# Initialize OCR with custom Khmer model
ocr = PaddleOCR(
    use_angle_cls=True,
    lang='ch',  # Use Chinese as base language
    rec_model_dir='path/to/model',  # Directory containing inference files
    rec_char_dict_path='khmer_char_dict.txt',
    show_log=False
)

# Process image
result = ocr.ocr('khmer_text_image.jpg', cls=True)

# Extract results: one entry per input image, None when no text was found
for res in result:
    if res is None:
        continue
    for line in res:
        text = line[1][0]  # Recognized text
        confidence = line[1][1]  # Confidence score
        print(f'Text: {text}, Confidence: {confidence:.3f}')
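A common follow-up is to discard low-confidence lines before using the text. The helper below is a minimal sketch that works on the nested result structure used in the loop above; the function name, the 0.8 threshold, and the sample data are illustrative assumptions, not values recommended by the model authors.

```python
def confident_lines(result, min_confidence=0.8):
    """Flatten per-image OCR output into (text, confidence) pairs above a threshold."""
    kept = []
    for page in result:
        if page is None:  # image with no detected text
            continue
        for _box, (text, confidence) in page:
            if confidence >= min_confidence:
                kept.append((text, confidence))
    return kept


# Hand-written stand-in for `result` (hypothetical values)
sample = [
    [
        [[[0, 0], [50, 0], [50, 20], [0, 20]], ("text-a", 0.97)],
        [[[0, 30], [50, 30], [50, 50], [0, 50]], ("text-b", 0.41)],
    ],
    None,
]
print(confident_lines(sample))  # [('text-a', 0.97)]
```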
Command Line Usage
# Download model files to a directory
# Then use PaddleOCR tools:
python tools/infer/predict_rec.py \
    --image_dir="your_khmer_image.png" \
    --rec_model_dir="path/to/model" \
    --rec_char_dict_path="khmer_char_dict.txt"
Files Included
| File | Size | Description |
|---|---|---|
| inference.pdiparams | ~106MB | Main model weights |
| inference.yml | ~2KB | Model configuration |
| inference.json | ~1KB | Model metadata |
| khmer_char_dict.txt | ~2KB | Character dictionary (188 characters) |
| training_config.yml | ~2KB | Original training configuration |
Training Details
Dataset Characteristics
- Text Length: 3-5 words per image (optimized for short segments)
- Image Size: 600×80 pixels (training), resized to 320×32 for inference
- Font: KhmerOS TTF
- Background: White background with black text
- Augmentation: Clean, blurred, noisy, and noise+blur variants
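The resize step has a consequence worth seeing in numbers. The sketch below assumes an aspect-ratio-preserving resize to height 32 with the width capped at 320 and the remainder padded, which is a common recognition preprocessing scheme; this card does not state which scheme was used, and the function names are hypothetical.

```python
import math


def resized_width(src_h, src_w, target_h=32, max_w=320):
    """Width after scaling to target_h with aspect ratio kept, capped at max_w."""
    return min(max_w, math.ceil(target_h * src_w / src_h))


def padding_width(src_h, src_w, target_h=32, max_w=320):
    """Columns of padding needed to fill the fixed input width."""
    return max_w - resized_width(src_h, src_w, target_h, max_w)


# A 600x80 training image scales to 240 columns and is padded with 80
print(resized_width(80, 600), padding_width(80, 600))  # 240 80

# Anything wider than 10:1 hits the cap and gets squeezed horizontally
print(resized_width(80, 1200))  # 320
```

Under this assumption, crops wider than about ten times their height are compressed, which is one reason long lines are better split before recognition.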
Training Configuration
- Epochs: 30 (best model at epoch 29)
- Optimizer: Adam with β₁=0.9, β₂=0.999
- Learning Rate: Cosine scheduling (initial: 0.001)
- Batch Size: 32
- Loss Function: CTC Loss
- Regularization: L2 (factor: 4e-05)
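For readers unfamiliar with cosine scheduling, the sketch below evaluates the standard cosine decay formula with the initial rate and epoch count listed above. It assumes no warmup, a final rate of zero, and one update per epoch; training frameworks typically update per iteration, and the card does not specify these details.

```python
import math


def cosine_lr(epoch, total_epochs=30, initial_lr=0.001):
    """Cosine decay from initial_lr at epoch 0 to zero at total_epochs."""
    return 0.5 * initial_lr * (1 + math.cos(math.pi * epoch / total_epochs))


# Start, midpoint, and the epoch where the best model was reported
for epoch in (0, 15, 29):
    print(epoch, f"{cosine_lr(epoch):.6f}")
```

The rate falls slowly at first, fastest around the midpoint, and is close to zero by the final epochs.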
Usage Tips
Best Practices
- Image Quality: Use high-contrast images with clear text
- Text Length: Optimal for 3-5 word segments (model's training focus)
- Resolution: Avoid very small text; inputs are resized to 32 pixels high, so heavily upscaled crops lose detail
- Preprocessing: Consider using text detection for full documents
For Long Text Documents
Because this model is optimized for short segments, process full documents as follows:
- Use Text Detection: Combine with PaddleOCR's detection model
- Segment Text: Break long lines into 3-5 word chunks
- Post-process: Combine results from multiple segments
# Example for full document processing
from paddleocr import PaddleOCR

ocr = PaddleOCR(
    use_angle_cls=True,
    lang='ch',
    det_model_dir='path/to/detection/model',  # Add detection model
    rec_model_dir='path/to/this/model',  # This Khmer recognition model
    rec_char_dict_path='khmer_char_dict.txt'
)

# This will detect text regions AND recognize them
result = ocr.ocr('full_document.jpg', cls=True)
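The post-processing step, combining results from multiple segments, can be sketched as a simple reading-order pass: group regions into rows by their top edge, then read each row left to right. The function name, the 10-pixel row tolerance, and the separator are illustrative assumptions; since Khmer does not put spaces between words, an empty separator may suit some layouts better.

```python
def reading_order_text(page, line_tolerance=10, sep=" "):
    """Join recognized segments row by row, left to right within each row.

    page is one image's result: a list of [box, (text, confidence)] entries,
    where box is a list of four [x, y] corner points.
    """
    items = []
    for box, (text, _confidence) in page:
        top = min(y for _x, y in box)
        left = min(x for x, _y in box)
        items.append((top, left, text))
    items.sort()

    rows = []  # each row is a list of (left, text) pairs
    row_top = None
    for top, left, text in items:
        # Start a new row when this region sits clearly below the current one
        if row_top is None or top - row_top > line_tolerance:
            rows.append([])
            row_top = top
        rows[-1].append((left, text))
    return "\n".join(sep.join(text for _left, text in sorted(row)) for row in rows)


# Hand-written stand-in for one page of results (hypothetical values)
page = [
    [[[200, 12], [300, 12], [300, 40], [200, 40]], ("B", 0.9)],
    [[[10, 10], [100, 10], [100, 40], [10, 40]], ("A", 0.9)],
    [[[10, 60], [100, 60], [100, 90], [10, 90]], ("C", 0.9)],
]
print(reading_order_text(page))  # prints "A B" then "C" on the next line
```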
Model Conversion
This model was exported from PaddlePaddle training format to inference format:
# Original export command used:
python tools/export_model.py \
    -c pretrainoutput/config.yml \
    -o Global.pretrained_model=pretrainoutput/best_accuracy.pdparams \
    Global.save_inference_dir=pretrainoutput/inference
Requirements
paddlepaddle>=2.4.0
opencv-python>=4.5.0
numpy>=1.19.0
pillow>=8.0.0
Citation
@misc{khmer-ocr-2025,
  title={Khmer OCR Recognition Model},
  author={[Your Name]},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/[your-username]/khmer-ocr}}
}