---
license: apache-2.0
tags:
- paddleocr
- ocr
- vision-language-model
- ernie-kit
- historical-document-processing
- handwriting-recognition
- gothic-script
- paleography
base_model: PaddlePaddle/PaddleOCR-VL-0.9B
language:
- es
pipeline_tag: image-to-text
---
# 📜 Chronos-VL: The 1545 Resurrection Engine

> **🏆 Baidu ERNIE AI Developer Challenge Submission**

**Chronos-VL** is a specialized fine-tune of **PaddleOCR-VL-0.9B**, engineered to decipher Early Modern Spanish Gothic script (c. 1545). It was trained on the **RODRIGO Corpus** using Baidu's **ERNIEKit** on an NVIDIA A100 GPU, and bridges the nearly 500-year gap between historical archives and modern AI.

Standard OCR models struggle with these manuscripts because of complex calligraphy, ligatures, and ink degradation. Chronos-VL achieves near-perfect transcription on clear text lines: on 100 unseen samples its median character error rate is 1.64%, against 19.82% for the baseline.
## 📊 Performance Benchmark
We conducted a side-by-side evaluation on 100 unseen historical samples using a custom A/B testing framework.
| Metric | Baseline (Standard PaddleOCR) | Chronos-VL (Ours) | Improvement |
| :--- | :--- | :--- | :--- |
| **Median Character Error Rate (CER)** | 19.82% | **1.64%** | **12x Better** |
| **Excellent Predictions (<5% Error)** | 1% | **77%** | **77x (+76 points)** |
| **Word Error Rate (WER)** | 74.44% | **17.35%** | **4x Better** |
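
This README does not spell out how the metrics were computed. A common definition is edit distance divided by reference length, at character level for CER and at word level for WER. The sketch below assumes that definition; the function names are hypothetical and are not taken from the project's A/B testing framework. It also shows how a median CER and a "share below 5%" figure could be aggregated over samples.

```python
from statistics import median


def levenshtein(ref, hyp):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: character edits / reference length."""
    return levenshtein(ref, hyp) / max(len(ref), 1)


def wer(ref, hyp):
    """Word error rate: word edits / number of reference words."""
    ref_words, hyp_words = ref.split(), hyp.split()
    return levenshtein(ref_words, hyp_words) / max(len(ref_words), 1)


def summarize(pairs, threshold=0.05):
    """Aggregate (reference, hypothesis) pairs into the two CER summaries."""
    cers = [cer(r, h) for r, h in pairs]
    return {
        "median_cer": median(cers),
        "share_below_threshold": sum(c < threshold for c in cers) / len(cers),
    }
```

Under this definition a single wrong character makes the whole word count as an error, which is one reason WER is usually well above CER.
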
## 🚀 Interactive Demo (Colab)
Don't just take our word for it. Run the **Chronos System** yourself.
Our interactive Gradio app allows you to:
1. **Compare** Baseline vs. Chronos-VL side-by-side.
2. **Visualize** the "X-Ray" overlay (Visual Restoration).
3. **Translate** the archaic text to Modern Spanish and English.
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/12ccCTTvJc9G6AfyvGPg0pG528bCXelaK?usp=sharing)
## 💻 Usage (Python)
To use this model in your own code, you need `paddleocr` and `huggingface_hub`.
```python
from huggingface_hub import snapshot_download
from paddleocr import PaddleOCR

# 1. Download the fine-tuned weights
local_dir = snapshot_download(repo_id="Deepesh-001/rodrigo-ocr-model")

# 2. Initialize the engine
# use_angle_cls=True handles rotated manuscript lines.
# Note: these keyword arguments follow the PaddleOCR 2.x API; newer
# releases may name them differently, so check your installed version.
ocr = PaddleOCR(
    rec_model_dir=local_dir,
    use_angle_cls=True,
    use_gpu=True,
)

# 3. Run inference on a 1545 manuscript
image_path = "rodrigo_sample.png"
result = ocr.ocr(image_path, cls=True)

# result[0] can be None when no text is detected in the image
for line in result[0] or []:
    text, confidence = line[1]
    print(f"Detected: {text} | Confidence: {confidence:.2f}")
```
## 🧠 The Chronos Pipeline (System Design)
This model is the core perception layer of the broader **Chronos System**:
1. **Visual Perception (AI):** Chronos-VL extracts raw Gothic text (e.g., *"dixo estonces"*).
2. **Semantic Normalization (Logic):** A post-processing engine normalizes Archaic Castilian spelling to Modern Spanish (e.g., *"dijo entonces"*).
3. **Global Access (Translation):** Automated translation to English, making Spanish heritage accessible to non-Spanish speakers.
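
The normalization step (stage 2) could be approximated with a word-level lookup. The following is only a minimal sketch under that assumption: the `ARCHAIC_TO_MODERN` table and the `normalize` function are hypothetical, and only the two entries taken from the example above are grounded in this README. The actual post-processing engine may work differently.

```python
import re

# Hypothetical lookup of archaic word forms to Modern Spanish.
# Only these two entries come from the README's own example; a real
# table would be much larger and curated against the corpus.
ARCHAIC_TO_MODERN = {
    "dixo": "dijo",
    "estonces": "entonces",
}

WORD = re.compile(r"\w+")


def normalize(text, table=ARCHAIC_TO_MODERN):
    """Replace known archaic word forms, leaving everything else untouched."""
    def swap(match):
        word = match.group(0)
        modern = table.get(word.lower())
        if modern is None:
            return word
        # Preserve an initial capital from the transcription
        return modern.capitalize() if word[0].isupper() else modern

    return WORD.sub(swap, text)


print(normalize("dixo estonces"))
```

Keeping the raw transcription and the normalized text as separate outputs preserves the paleographic reading for scholars while giving the translation stage cleaner input.
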
## 📂 Dataset Info
Trained on the **RODRIGO Corpus** (Spanish State Archives).
- **Era:** 1545
- **Script:** Gothic Cursive
- **Size:** 9,000 text lines (80/20 Split)
- **Format:** Page-XML converted to ERNIEKit JSONL
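
The conversion from PAGE-XML to JSONL might look roughly like the sketch below. It reads each `TextLine` with its `Coords` polygon and `TextEquiv/Unicode` transcription, then shuffles and splits the records 80/20. The record field names (`image`, `line_id`, `points`, `text`) are placeholders, not the confirmed ERNIEKit schema, and the helper names are hypothetical.

```python
import json
import random
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace so the code works across PAGE schema versions."""
    return tag.rsplit("}", 1)[-1]


def extract_lines(page_xml):
    """Yield (line_id, polygon_points, text) for each TextLine in a PAGE-XML string."""
    root = ET.fromstring(page_xml)
    for line in root.iter():
        if local(line.tag) != "TextLine":
            continue
        points, text = None, ""
        for child in line:  # direct children only, so word-level text is ignored
            if local(child.tag) == "Coords":
                points = child.get("points")
            elif local(child.tag) == "TextEquiv":
                for sub in child:
                    if local(sub.tag) == "Unicode":
                        text = (sub.text or "").strip()
        yield line.get("id"), points, text


def to_records(page_xml, image_name):
    """Build one record per non-empty line. Field names are placeholders."""
    return [
        {"image": image_name, "line_id": lid, "points": pts, "text": txt}
        for lid, pts, txt in extract_lines(page_xml)
        if txt
    ]


def split_records(records, train_fraction=0.8, seed=0):
    """Shuffle deterministically and split, e.g. 80/20."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def to_jsonl(records):
    """Serialize records as one JSON object per line, keeping non-ASCII text readable."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

With 9,000 lines, an 80/20 split yields 7,200 lines in the larger partition and 1,800 held out.
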
## 🔗 Links
- **Code Repository:** <https://github.com/deepeshahlawat/Chronos-VL>
- **Project Video:** <https://www.youtube.com/watch?v=PaK24VT_3Jk>

*Built with ❤️ using PaddlePaddle and ERNIEKit.*