---
license: apache-2.0
tags:
- paddleocr
- ocr
- vision-language-model
- ernie-kit
- historical-document-processing
- handwriting-recognition
- gothic-script
- paleography
base_model: PaddlePaddle/PaddleOCR-VL-0.9B
language:
- es
pipeline_tag: image-to-text
---
# Chronos-VL: The 1545 Resurrection Engine
> **Baidu ERNIE AI Developer Challenge Submission**
**Chronos-VL** is a specialized fine-tune of **PaddleOCR-VL-0.9B**, engineered to decipher Early Modern Spanish Gothic script (c. 1545). Trained on the **RODRIGO Corpus** using Baidu's **ERNIEKit** on an NVIDIA A100 GPU, this model bridges the 500-year gap between ancient archives and modern AI.
Standard OCR models struggle with these historical manuscripts because of complex calligraphy, ligatures, and ink degradation. Chronos-VL transcribes clear text lines almost perfectly, with a median character error rate of 1.64% on our evaluation set.
## Performance Benchmark
We conducted a side-by-side evaluation on 100 unseen historical samples using a custom A/B testing framework.
| Metric | Baseline (Standard PaddleOCR) | Chronos-VL (Ours) | Improvement |
| :--- | :--- | :--- | :--- |
| **Median Character Error Rate (CER)** | 19.82% | **1.64%** | **12x lower** |
| **Excellent Predictions (<5% Error)** | 1% | **77%** | **77x more (+76 points)** |
| **Word Error Rate (WER)** | 74.44% | **17.35%** | **4.3x lower** |
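
For readers who want to reproduce numbers of this kind, here is a minimal sketch of how CER, WER, and the "under 5% error" share are commonly computed from edit distance. It is not the A/B framework used for the table above; the sample pairs are hypothetical and the function names are our own.

```python
from statistics import median


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(len(ref_words), 1)


# Hypothetical (reference, prediction) pairs, for illustration only.
pairs = [
    ("dixo estonces", "dixo estonces"),
    ("dixo estonces", "dixo estonçes"),
    ("dixo estonces", "dijo entonces"),
]
cers = [cer(ref, hyp) for ref, hyp in pairs]
wers = [wer(ref, hyp) for ref, hyp in pairs]
share_excellent = sum(c < 0.05 for c in cers) / len(cers)
print(f"median CER = {median(cers):.2%}")
print(f"median WER = {median(wers):.2%}")
print(f"lines under 5% CER = {share_excellent:.0%}")
```

Reporting the median rather than the mean keeps a few badly degraded lines from dominating the score, which is one plausible reason to prefer it on manuscript data.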
## Interactive Demo (Colab)
Don't just take our word for it. Run the **Chronos System** yourself.
Our interactive Gradio app allows you to:
1. **Compare** Baseline vs. Chronos-VL side-by-side.
2. **Visualize** the "X-Ray" overlay (Visual Restoration).
3. **Translate** the archaic text to Modern Spanish and English.
[Open in Colab](https://colab.research.google.com/drive/12ccCTTvJc9G6AfyvGPg0pG528bCXelaK?usp=sharing)
## 💻 Usage (Python)
To use this model in your own code, you need `paddleocr` and `huggingface_hub`.
```python
from huggingface_hub import snapshot_download
from paddleocr import PaddleOCR

# 1. Download the fine-tuned weights
local_dir = snapshot_download(repo_id="Deepesh-001/rodrigo-ocr-model")

# 2. Initialize the engine
# use_angle_cls=True handles rotated manuscript lines.
# Note: these argument names follow the PaddleOCR 2.x API; newer releases may differ.
ocr = PaddleOCR(
    rec_model_dir=local_dir,
    use_angle_cls=True,
    use_gpu=True,
)

# 3. Run inference on a 1545 manuscript
image_path = "rodrigo_sample.png"
result = ocr.ocr(image_path, cls=True)

# result[0] may be None when no text is detected, so fall back to an empty list
for line in result[0] or []:
    text, confidence = line[1]
    print(f"Detected: {text} | Confidence: {confidence:.2f}")
```
## 🧠 The Chronos Pipeline (System Design)
This model is the core perception layer of the broader **Chronos System**:
1. **Visual Perception (AI):** Chronos-VL extracts raw Gothic text (e.g., *"dixo estonces"*).
2. **Semantic Normalization (Logic):** A post-processing engine normalizes Archaic Castilian spelling to Modern Spanish (e.g., *"dijo entonces"*).
3. **Global Access (Translation):** Automated translation to English, making Spanish heritage accessible to non-Spanish speakers.
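
To make step 2 concrete, here is a minimal sketch of a lexicon-based normalizer. It is an assumption about how such a step could work, not the actual Chronos post-processing engine: the `ARCHAIC_TO_MODERN` table and the `normalize` function are hypothetical, and only the *dixo estonces* example comes from the list above.

```python
import re

# Hypothetical archaic -> modern lookup. Both entries come from the example
# above; a real system would need a far larger lexicon or rule set.
ARCHAIC_TO_MODERN = {
    "dixo": "dijo",
    "estonces": "entonces",
}


def normalize(text, lexicon=ARCHAIC_TO_MODERN):
    """Replace whole words found in the lexicon and leave all others untouched."""
    def swap(match):
        word = match.group(0)
        modern = lexicon.get(word.lower())
        if modern is None:
            return word  # unknown word: keep the transcription as is
        # Preserve an initial capital from the transcription.
        return modern.capitalize() if word[0].isupper() else modern

    # \w+ matches Unicode letters, so words with ç or ñ are handled as units.
    return re.sub(r"\w+", swap, text)


print(normalize("Dixo estonces"))
```

Matching whole words rather than substrings avoids rewriting letters inside unrelated words, at the cost of missing spelling variants that are not listed in the lexicon.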
## Dataset Info
Trained on the **RODRIGO Corpus** (Spanish State Archives).
- **Era:** 1545
- **Script:** Gothic Cursive
- **Size:** 9,000 text lines (80/20 Split)
- **Format:** Page-XML converted to ERNIEKit JSONL
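
The conversion and split can be sketched as follows. This is an illustration under stated assumptions, not the project's actual preprocessing script: the JSONL field names (`image`, `line_id`, `polygon`, `text`) are hypothetical, since the schema ERNIEKit expects is not described here, and the sample XML is made up. The `TextLine`, `Coords`, and `TextEquiv/Unicode` elements are standard PAGE-XML.

```python
import json
import random
import xml.etree.ElementTree as ET

# Made-up two-line PAGE-XML document, for illustration only.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="page_001.png">
    <TextRegion id="r1">
      <TextLine id="l1">
        <Coords points="10,10 200,10 200,40 10,40"/>
        <TextEquiv><Unicode>dixo estonces</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="l2">
        <Coords points="10,50 200,50 200,80 10,80"/>
        <TextEquiv><Unicode></Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def page_xml_to_records(xml_text, image_name):
    """Return one dict per TextLine that carries a non-empty transcription."""
    root = ET.fromstring(xml_text)
    records = []
    # "{*}" matches any namespace (Python 3.8+), so the schema version is irrelevant.
    for line in root.findall(".//{*}TextLine"):
        unicode_el = line.find("{*}TextEquiv/{*}Unicode")
        coords_el = line.find("{*}Coords")
        text = (unicode_el.text or "").strip() if unicode_el is not None else ""
        if not text:
            continue  # skip lines without ground truth
        records.append({
            "image": image_name,       # hypothetical field names
            "line_id": line.get("id"),
            "polygon": coords_el.get("points") if coords_el is not None else None,
            "text": text,
        })
    return records


def train_eval_split(items, train_fraction=0.8, seed=0):
    """Shuffle deterministically and cut at the requested fraction."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


records = page_xml_to_records(SAMPLE, "page_001.png")
for record in records:
    # One JSON object per line; ensure_ascii=False keeps ç, ñ, etc. readable.
    print(json.dumps(record, ensure_ascii=False))
```

Applied to 9,000 lines, `train_eval_split` with the default `train_fraction=0.8` would yield 7,200 training lines and 1,800 held-out lines.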
## Links
- **Code Repository:** <https://github.com/deepeshahlawat/Chronos-VL>
- **Project Video:** <https://www.youtube.com/watch?v=PaK24VT_3Jk>
*Built with ❤️ using PaddlePaddle and ERNIEKit.*