	---
	license: apache-2.0
	tags:
	- paddleocr
	- ocr
	- vision-language-model
	- ernie-kit
	- historical-document-processing
	- handwriting-recognition
	- gothic-script
	- paleography
	base_model: PaddlePaddle/PaddleOCR-VL-0.9B
	language:
	- es
	pipeline_tag: image-to-text
	---

	# 📜 Chronos-VL: The 1545 Resurrection Engine

	> 🏆 Baidu ERNIE AI Developer Challenge Submission

	Chronos-VL is a specialized fine-tune of PaddleOCR-VL-0.9B, engineered to decipher Early Modern Spanish Gothic script (c. 1545). Trained on the RODRIGO Corpus using Baidu's ERNIEKit on an NVIDIA A100 GPU, this model bridges the nearly 500-year gap between Early Modern archives and modern AI.

	While standard OCR models fail on these historical manuscripts due to complex calligraphy, ligatures, and ink degradation, Chronos-VL achieves near-perfect transcription on clear text lines (a median character error rate of 1.64% in the benchmark below).

	## 📊 Performance Benchmark

	We conducted a side-by-side evaluation on 100 unseen historical samples using a custom A/B testing framework.

	| Metric | Baseline (Standard PaddleOCR) | Chronos-VL (Ours) | Improvement |
	| :--- | :--- | :--- | :--- |
	| Median Character Error Rate (CER) | 19.82% | 1.64% | 12x lower |
	| Excellent Predictions (<5% Error) | 1% | 77% | 77x more (+76 points) |
	| Word Error Rate (WER) | 74.44% | 17.35% | 4.3x lower |
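
For readers who want to reproduce metrics of this kind, here is a minimal sketch of how CER, WER, and the share of "excellent" predictions can be computed from (reference, prediction) pairs. It assumes plain Levenshtein edit distance normalized by reference length, which is the usual definition but not necessarily what the A/B framework used here implements. The helper names and the sample pairs are hypothetical.

```python
from statistics import median


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(len(ref_words), 1)


def summarize(pairs, threshold=0.05):
    """Median CER and the share of samples with CER below the threshold."""
    cers = [cer(ref, hyp) for ref, hyp in pairs]
    return {
        "median_cer": median(cers),
        "share_below_threshold": sum(c < threshold for c in cers) / len(cers),
    }


# Hypothetical (reference, prediction) pairs, for illustration only
pairs = [("dixo estonces", "dixo estonces"), ("dixo estonces", "dixo estonce")]
print(summarize(pairs))
```

The median is less sensitive than the mean to a handful of badly degraded lines, which may be why the table reports it for CER.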


	## 🚀 Interactive Demo (Colab)

	Don't just take our word for it. Run the Chronos System yourself.
	Our interactive Gradio app allows you to:
	1. Compare Baseline vs. Chronos-VL side-by-side.
	2. Visualize the "X-Ray" overlay (Visual Restoration).
	3. Translate the archaic text to Modern Spanish and English.

	[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/12ccCTTvJc9G6AfyvGPg0pG528bCXelaK?usp=sharing)

	## 💻 Usage (Python)

	To use this model in your own code, you need `paddleocr` and `huggingface_hub`.

	```python
	from huggingface_hub import snapshot_download
	from paddleocr import PaddleOCR

	# 1. Download the Fine-Tuned Weights
	local_dir = snapshot_download(repo_id="Deepesh-001/rodrigo-ocr-model")

	# 2. Initialize the Engine
	# use_angle_cls=True handles rotated manuscript lines.
	# Note: use_gpu, use_angle_cls, and cls follow the PaddleOCR 2.x API;
	# newer releases may name these arguments differently.
	ocr = PaddleOCR(
	    rec_model_dir=local_dir,
	    use_angle_cls=True,
	    use_gpu=True,
	)

	# 3. Run Inference on a 1545 Manuscript
	image_path = "rodrigo_sample.png"
	result = ocr.ocr(image_path, cls=True)

	# result[0] may be None when no text is detected, so fall back to an empty list
	for line in result[0] or []:
	    text, confidence = line[1]
	    print(f"Detected: {text} | Confidence: {confidence:.2f}")
	```

	## 🧠 The Chronos Pipeline (System Design)

	This model is the core perception layer of the broader Chronos System:

	1. Visual Perception (AI): Chronos-VL extracts raw Gothic text (e.g., "dixo estonces").
	2. Semantic Normalization (Logic): A post-processing engine normalizes Archaic Castilian spelling to Modern Spanish (e.g., "dijo entonces").
	3. Global Access (Translation): Automated translation to English, making Spanish heritage accessible to non-Spanish speakers.
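
As an illustration of step 2, the normalization stage could be approximated with a word-level lookup table. This is only a sketch under that assumption, not the project's actual post-processing engine. The two lexicon entries come from the example above; the function name and the table itself are hypothetical.

```python
import re

# Hypothetical rule table: only these two entries come from the example in
# this card; a real normalizer would need a much larger lexicon.
ARCHAIC_TO_MODERN = {
    "dixo": "dijo",
    "estonces": "entonces",
}


def normalize(text, lexicon=ARCHAIC_TO_MODERN):
    """Replace known archaic spellings word by word, leaving the rest as-is."""

    def replace(match):
        word = match.group(0)
        modern = lexicon.get(word.lower())
        if modern is None:
            return word  # unknown word: keep the transcription unchanged
        # Preserve an initial capital from the transcription
        return modern.capitalize() if word[0].isupper() else modern

    return re.sub(r"\w+", replace, text)


print(normalize("dixo estonces"))  # prints "dijo entonces"
```

A lookup table keeps the step deterministic and auditable, at the cost of missing spellings that are not in the lexicon.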

	## 📂 Dataset Info
	Trained on the RODRIGO Corpus (Spanish State Archives).
	- Era: 1545
	- Script: Gothic Cursive
	- Size: 9,000 text lines (80/20 Split)
	- Format: Page-XML converted to ERNIEKit JSONL
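
The conversion and split described above might look roughly like the following sketch. It reads line transcriptions from PAGE-XML (`TextLine` / `TextEquiv` / `Unicode`) and writes one JSON object per line. The JSONL field names (`id`, `text`), the seeded shuffle, and the sample document are assumptions for illustration; the schema ERNIEKit actually expects, including how each line is linked to its image, is not specified in this card.

```python
import json
import random
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace so any PAGE schema version matches."""
    return tag.rsplit("}", 1)[-1]


def extract_lines(page_xml):
    """Yield (line_id, text) for every TextLine with a line-level transcription."""
    root = ET.fromstring(page_xml)
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Only direct TextEquiv children, to skip word-level transcriptions
        for child in elem:
            if local_name(child.tag) != "TextEquiv":
                continue
            uni = next((c for c in child if local_name(c.tag) == "Unicode"), None)
            if uni is not None and uni.text:
                yield elem.get("id"), uni.text.strip()


def to_jsonl_split(records, train_ratio=0.8, seed=0):
    """Shuffle deterministically and return (train_jsonl, held_out_jsonl)."""
    records = list(records)
    random.Random(seed).shuffle(records)
    cut = int(len(records) * train_ratio)

    def dump(rows):
        # "id" and "text" are hypothetical field names
        return "\n".join(
            json.dumps({"id": i, "text": t}, ensure_ascii=False) for i, t in rows
        )

    return dump(records[:cut]), dump(records[cut:])


# Hypothetical five-line PAGE-XML document, for illustration only
NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"
lines = "".join(
    f'<TextLine id="l{i}"><TextEquiv><Unicode>texto {i}</Unicode></TextEquiv></TextLine>'
    for i in range(5)
)
SAMPLE = f'<PcGts xmlns="{NS}"><Page><TextRegion id="r1">{lines}</TextRegion></Page></PcGts>'

train, held_out = to_jsonl_split(extract_lines(SAMPLE))
print(len(train.splitlines()), len(held_out.splitlines()))
```

Fixing the shuffle seed keeps the 80/20 partition reproducible across runs, so evaluation samples stay unseen during training.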

	## 🔗 Links
	- Code Repository: https://github.com/deepeshahlawat/Chronos-VL
	- Project Video: https://www.youtube.com/watch?v=PaK24VT_3Jk

	Built with ❤️ using PaddlePaddle and ERNIEKit.