	---
	tags:
	- math-ocr
	- latex
	- unimer
	- pytorch-lightning
	---

	# Texo — UniMER replication (step 1750 checkpoint)

	Full fine-tune of [PP-FormulaNet-S](https://huggingface.co/alephpi/FormulaNet) on [m4xi/unimer-merged](https://huggingface.co/datasets/m4xi/unimer-merged) for math formula recognition (image → LaTeX).

	## Base model

	PP-FormulaNet-S (`formulanet.pt` from alephpi/FormulaNet) is a 58M-parameter model with an HGNetV2 encoder and an MBart decoder. This is the distilled checkpoint from the original Texo paper, prior to any UniMER training.

	## Training

	| Setting | Value |
	|---|---|
	| Dataset | m4xi/unimer-merged (~1.04M train samples, 98/2 train/val split) |
	| Fine-tune strategy | Full (no LoRA) |
	| Effective batch size | 64 (16 per device × 4 grad accum) |
	| Learning rate | 1e-4, cosine decay |
	| Precision | bf16 |
	| Steps | 1,750 (~0.11 epochs, ~112K samples seen) |
	| Hardware | NVIDIA RTX 4090, ~42h total runtime (crashed) |
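
The step, sample, and epoch figures in the table follow from the batch settings. The short sketch below reproduces that arithmetic; it assumes a single GPU and that "steps" counts optimizer steps (after gradient accumulation), neither of which the table states explicitly.

```python
# Values copied from the Training table; variable names are illustrative only.
per_device_batch = 16
grad_accum_steps = 4
optimizer_steps = 1_750
train_samples = 1_040_000  # "~1.04M train samples" (approximate)

# Assumption: one device, so the effective batch is per-device batch x accumulation.
effective_batch = per_device_batch * grad_accum_steps
samples_seen = optimizer_steps * effective_batch
epoch_fraction = samples_seen / train_samples

print(f"effective batch: {effective_batch}")
print(f"samples seen:    {samples_seen:,}")
print(f"epoch fraction:  {epoch_fraction:.2f}")
```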

	## Checkpoint selection

	The run crashed after ~70k steps. Validation BLEU peaked at step 1750 (BLEU=0.7247, edit_distance=0.0582) and degraded monotonically after that. With `save_top_k=5`, only the five best-BLEU checkpoints were retained, and all of them fell within the first ~15% of epoch 1, which is consistent with rapid convergence from a strong pretrained base.
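
The retention behaviour described above can be mimicked in a few lines of plain Python. This is only a sketch of a "keep the k best by a monitored metric" policy, not Lightning's own code. The helper name `retained_checkpoints` is hypothetical, and every score in `val_bleu` is a made-up placeholder except the step-1750 value, which is the BLEU reported here.

```python
import heapq


def retained_checkpoints(val_scores: dict[int, float], k: int = 5) -> list[int]:
    """Return the steps of the k highest-scoring checkpoints, in step order."""
    best = heapq.nlargest(k, val_scores.items(), key=lambda item: item[1])
    return sorted(step for step, _ in best)


# Placeholder validation curve (step -> BLEU). Only 1750 -> 0.7247 comes from
# this run; the other values are invented to show a peak followed by decline.
val_bleu = {
    1250: 0.705,
    1500: 0.718,
    1750: 0.7247,
    2000: 0.720,
    2250: 0.712,
    2500: 0.700,
    5000: 0.650,
    10000: 0.600,
}

kept = retained_checkpoints(val_bleu, k=5)
best_step = max(val_bleu, key=val_bleu.get)
print("retained steps:", kept)
print("best step:", best_step)
```

When the metric peaks early and then declines, every later checkpoint is evicted as soon as it is written, so the surviving files cluster around the peak regardless of how long the run continues.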

	## Usage

	```python
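	import torch
	from task import FormulaNetLit  # LightningModule from Texo's src/task.py

	# Load the step-1750 checkpoint on CPU and disable dropout etc.
	model = FormulaNetLit.load_from_checkpoint("texo-cp-1750.ckpt", map_location="cpu")
	model.eval()

	# `pixel_values` is not defined in this snippet: it must be a batch of
	# preprocessed formula images prepared by the caller.
	with torch.no_grad():
	    # Greedy decoding: a single beam and no sampling.
	    outputs = model.generate(pixel_values, num_beams=1, do_sample=False, max_new_tokens=512)
	pred = model.tokenizer.batch_decode(outputs, skip_special_tokens=True)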
	```
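
To sanity-check decoded predictions against reference LaTeX, a normalized edit distance is a convenient score. The sketch below is a character-level Levenshtein distance divided by the longer string's length. This normalization is an assumption: the `edit_distance` reported in this card may be computed differently (for example on tokens), so the numbers are not guaranteed to be comparable. The function names are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: str, ref: str) -> float:
    """Edit distance scaled to [0, 1] by the longer of the two strings."""
    longest = max(len(pred), len(ref))
    return levenshtein(pred, ref) / longest if longest else 0.0


# Example strings are illustrative, not model outputs.
score = normalized_edit_distance(r"\frac{a}{b}", r"\frac{a}{c}")
print(f"normalized edit distance: {score:.4f}")
```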