TrOCR-LaTeX (fine-tuned on math handwriting)
Take your handwritten math and turn it into clean LaTeX code.
This is a fine-tuned version of microsoft/trocr-base-handwritten,
a transformer-based optical character recognition model, adapted to work with handwritten math images and structured math syntax.
Data
Fine-tuned on Google's MathWriting dataset, which contains over 500,000 digital inks of handwritten mathematical expressions obtained through either manual labelling or programmatic generation.
Intended use & limitations
You can use this model for OCR on a single math expression.
Performance degrades on very long expressions because of image preprocessing; a 3:2 aspect ratio seems to work best.
- To work around this limitation, create an expression chunking scheme that splits the image into subimages and processes each one separately.
- To process multiple expressions, first chunk the group into single expressions.
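One possible chunking scheme, sketched below, cuts the image at vertical whitespace gaps and then greedily merges neighbouring pieces up to a maximum width. This is only an illustration and not part of the model's own pipeline: the function names `ink_spans` and `group_spans`, the `min_gap` threshold, and the idea of choosing `max_width` as roughly 1.5 times the image height (to approximate the 3:2 ratio) are all assumptions. The input is a per-column count of dark pixels, which you would compute from your own image.

```python
def ink_spans(column_ink, min_gap=3):
    """Return (start, end) column ranges that contain ink.

    Ranges are end-exclusive and are separated by at least `min_gap`
    consecutive blank columns. `column_ink[x]` is the number of dark
    pixels in column x (hypothetical input, computed by the caller).
    """
    spans = []
    start = None
    blank_run = 0
    for x, ink in enumerate(column_ink):
        if ink > 0:
            if start is None:
                start = x
            blank_run = 0
        elif start is not None:
            blank_run += 1
            if blank_run >= min_gap:
                # Close the span just before the run of blank columns
                spans.append((start, x - blank_run + 1))
                start = None
                blank_run = 0
    if start is not None:
        # Close a span that runs to (or near) the right edge
        spans.append((start, len(column_ink) - blank_run))
    return spans


def group_spans(spans, max_width):
    """Greedily merge neighbouring spans into chunks no wider than `max_width`."""
    chunks = []
    for start, end in spans:
        if chunks and end - chunks[-1][0] <= max_width:
            chunks[-1] = (chunks[-1][0], end)
        else:
            chunks.append((start, end))
    return chunks


# Toy column profile: two inked regions separated by three blank columns
profile = [0, 1, 1, 0, 0, 0, 1, 0, 1, 0]
spans = ink_spans(profile, min_gap=3)      # [(1, 3), (6, 9)]
chunks = group_spans(spans, max_width=4)   # [(1, 3), (6, 9)]
```

With Pillow, each chunk could then be cut out with `image.crop((start, 0, end, image.height))` and passed through the model separately. Cutting on whitespace can separate symbols that belong together, so the thresholds would need tuning on real data.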
How to use (PyTorch)
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
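# Helper function (path to either a JPEG or a PNG)
def open_PIL_image(image_path: str) -> Image.Image:
    image = Image.open(image_path)
    if image_path.split('.')[-1].lower() == 'png':
        # Flatten any transparency onto a white background
        image = image.convert('RGBA')
        image = Image.composite(image, Image.new('RGB', image.size, 'white'), image)
    # The processor expects 3-channel RGB input
    return image.convert('RGB')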
# Load model and processor from Hugging Face
processor = TrOCRProcessor.from_pretrained('tjoab/latex_finetuned')
model = VisionEncoderDecoderModel.from_pretrained('tjoab/latex_finetuned')
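# Load all images as a batch
# NOTE: `paths` is a placeholder; replace it with your own image file paths
paths = ['expression_1.png', 'expression_2.jpg']
images = [open_PIL_image(path) for path in paths]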
# Preprocess the images
preproc_image = processor.image_processor(images=images, return_tensors="pt").pixel_values
# Generate and decode the tokens
# NOTE: max_length default value is very small, which often results in truncated inference if not set
pred_ids = model.generate(preproc_image, max_length=128)
latex_preds = processor.batch_decode(pred_ids, skip_special_tokens=True)
Training Details
- Mini-batch size: 8
- Optimizer: Adam
- LR Scheduler: cosine
- fp16 mixed precision
  - Trained using automatic mixed precision (AMP) with torch.cuda.amp for reduced memory usage.
- Gradient accumulation
  - Used to simulate a larger effective batch size while keeping per-step memory consumption low.
  - Optimizer steps occurred every 8 mini-batches.
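With a mini-batch size of 8 and an optimizer step every 8 mini-batches, the effective batch size works out to 64. The sketch below illustrates why accumulation behaves like one large batch, using a toy scalar least-squares model in plain Python. It is not the original training script (which is not shown here): the model, the data, and the helper names `grad_mse` and `accumulated_gradient` are assumptions made for illustration.

```python
MINI_BATCH_SIZE = 8  # from the training details above
ACCUM_STEPS = 8      # one optimizer step every 8 mini-batches


def grad_mse(w, batch):
    """Gradient w.r.t. w of the mean squared error of y ~ w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(w, mini_batches):
    """Sum per-mini-batch gradients, each scaled by 1 / number of mini-batches.

    This mirrors dividing each mini-batch loss by the number of accumulation
    steps before backpropagating, then stepping the optimizer once at the end.
    """
    total = 0.0
    for batch in mini_batches:
        total += grad_mse(w, batch) / len(mini_batches)
    return total


# Toy data (hypothetical): 64 samples drawn from y = 3x
data = [(float(i), 3.0 * i) for i in range(MINI_BATCH_SIZE * ACCUM_STEPS)]
mini_batches = [
    data[i:i + MINI_BATCH_SIZE] for i in range(0, len(data), MINI_BATCH_SIZE)
]

effective_batch_size = MINI_BATCH_SIZE * ACCUM_STEPS  # 64
g_accum = accumulated_gradient(0.0, mini_batches)     # 8 small batches
g_full = grad_mse(0.0, data)                          # one batch of 64
```

Here `g_accum` and `g_full` agree, while only 8 samples need to be held in memory at a time. In a PyTorch loop the same idea is usually expressed by dividing the loss by the number of accumulation steps before `backward()` and calling `optimizer.step()` on every 8th mini-batch.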
Evaluation
Performance was evaluated using Character Error Rate (CER) defined as:
CER = (Substitutions + Insertions + Deletions) / Total Characters in Ground Truth
Why CER?
- Math expressions are structurally sensitive. Changing or swapping even a single character can completely change the meaning.
  - x^2 vs. x_2
  - \frac{a}{b} vs. \frac{b}{a}
- CER penalizes even small errors in syntax.
Evaluation yielded a CER of 14.9%.
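The CER formula above can be computed with a standard Levenshtein edit distance, as in the minimal sketch below. The helper names `edit_distance` and `cer` are illustrative, and the sketch assumes the metric is taken over raw characters of the LaTeX string; the exact evaluation script behind the reported figure is not shown here and may differ (for example in whitespace handling).

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance divided by the ground-truth length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


# The two confusions listed above
print(cer("x^2", "x_2"))                    # one substitution over 3 characters
print(cer(r"\frac{a}{b}", r"\frac{b}{a}"))  # two substitutions over 11 characters
```

A single wrong character in a short expression such as x^2 already costs a third of the score, which is the sensitivity the metric is chosen for.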
BibTeX and Citation
The original TrOCR model was introduced in this paper:
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models by Li et al.
You can find the source code in their repository.
@misc{li2021trocr,
title={TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models},
author={Minghao Li and Tengchao Lv and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei},
year={2021},
eprint={2109.10282},
archivePrefix={arXiv},
primaryClass={cs.CL}
}