---
language:
- km
license: apache-2.0
tags:
- ocr
- transformer
- vision
pipeline_tag: image-to-text
---

# ViTOCR

This repository contains a pure Transformer-based checkpoint for Khmer OCR. Images are patch-embedded and encoded by a Transformer encoder, and the text is then decoded autoregressively.

## Installation

```bash
pip install onnxruntime pillow torch torchvision numpy
```

## Get the inference script

Download it from this model repo:

```bash
curl -L -o onnx_inference.py https://huggingface.co/metythorn/ViTOCR-base/resolve/main/onnx_inference.py
```

Or copy `onnx_inference.py` from the repository files into your project directory.

## Usage

```python
from onnx_inference import ONNXPredictor

predictor = ONNXPredictor(
    model_path="model.onnx",
    config_path="config.json",
    providers=["CPUExecutionProvider"],  # or include CUDAExecutionProvider if available
)

result = predictor.predict("sample_image.png")
print("Predicted text:", result)
```
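
The usage example mentions adding `CUDAExecutionProvider` when it is available. A minimal sketch of one way to pick the provider list at runtime is shown below. The helper names `choose_providers` and `detect_providers` are hypothetical and not part of `onnx_inference.py`; the only library call used is `onnxruntime.get_available_providers()`. Note that the CUDA provider is typically only listed when the GPU build of ONNX Runtime (`onnxruntime-gpu`) is installed.

```python
def choose_providers(available):
    """Return execution providers in preference order: CUDA first, then CPU.

    `available` is a list of provider names, for example the result of
    onnxruntime.get_available_providers(). This helper is illustrative only.
    """
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    chosen = [name for name in preferred if name in available]
    # Fall back to CPU if nothing in the preferred list was reported.
    return chosen or ["CPUExecutionProvider"]


def detect_providers():
    """Query ONNX Runtime for its providers and apply the preference order."""
    import onnxruntime as ort  # imported lazily so choose_providers works without it

    return choose_providers(ort.get_available_providers())


# Assumed usage with the predictor from the Usage section:
#
#   from onnx_inference import ONNXPredictor
#   predictor = ONNXPredictor(
#       model_path="model.onnx",
#       config_path="config.json",
#       providers=detect_providers(),
#   )
```

Keeping the selection logic in a plain function that takes a list makes it easy to check without a GPU or even without ONNX Runtime installed.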