TrOCR-LaTeX (fine-tuned on math handwriting)

Take your handwritten math and turn it into clean LaTeX code. This is a fine-tuned version of microsoft/trocr-base-handwritten, a transformer-based optical character recognition model, adapted to work with handwritten math images and structured math syntax.

Data

The model was fine-tuned on Google's MathWriting dataset, which contains over 500,000 digital inks of handwritten mathematical expressions obtained through either manual labelling or programmatic generation.

Intended use & limitations

You can use this model for OCR on a single math expression.

Performance degrades on very long expressions. Because of the image preprocessing, inputs with roughly a 3:2 aspect ratio seem to work best.

To work around this limitation, create an expression chunking scheme that splits the image into subimages and processes each one separately.
Likewise, to process multiple expressions, you need to chunk the group into single expressions first.
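
The chunking idea above can be sketched as follows. This is a minimal illustration, not the author's scheme: the helper name `chunk_boxes`, the fixed-width windows, and the `overlap` parameter are all assumptions made for the example. It only computes crop boxes close to the 3:2 ratio mentioned above, and a naive split like this can cut through a symbol, so a real scheme would likely split on whitespace gaps instead.

```python
def chunk_boxes(width, height, aspect=1.5, overlap=0.0):
    """Return (left, upper, right, lower) boxes covering a wide image.

    Hypothetical helper: each box is about `aspect` times as wide as the
    image is tall (1.5 corresponds to 3:2). `overlap` is the fraction of a
    chunk shared with the next one and must be in [0, 1).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    chunk_w = max(1, round(height * aspect))
    if width <= chunk_w:
        # Already at or below the target ratio: keep the image whole.
        return [(0, 0, width, height)]
    step = max(1, round(chunk_w * (1.0 - overlap)))
    boxes = []
    left = 0
    while left + chunk_w < width:
        boxes.append((left, 0, left + chunk_w, height))
        left += step
    # Final chunk sits flush with the right edge, so it may overlap the previous one.
    boxes.append((width - chunk_w, 0, width, height))
    return boxes


boxes = chunk_boxes(1000, 200)
```

Each box can be passed to PIL's `Image.crop(box)` to obtain a subimage, which is then run through the model like any single expression. How the per-chunk LaTeX strings are stitched back together is left open here.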

How to use (PyTorch)

from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image

# Helper function (path to either JPEG or PNG)
def open_PIL_image(image_path: str) -> Image.Image:
  image = Image.open(image_path)
  if image_path.split('.')[-1].lower() == 'png':
      # Flatten any transparency onto a white background
      image = image.convert('RGBA')
      background = Image.new('RGBA', image.size, 'white')
      image = Image.alpha_composite(background, image)
  # The image processor expects 3-channel RGB input
  return image.convert('RGB')


# Load model and processor from Hugging Face
processor = TrOCRProcessor.from_pretrained('tjoab/latex_finetuned')
model = VisionEncoderDecoderModel.from_pretrained('tjoab/latex_finetuned')


# Load all images as a batch
# NOTE: placeholder paths, replace with your own image files
paths = ['expression_1.png', 'expression_2.jpg']
images = [open_PIL_image(path) for path in paths]

# Preprocess the images
preproc_image = processor.image_processor(images=images, return_tensors="pt").pixel_values

# Generate and decode the tokens
# NOTE: max_length default value is very small, which often results in truncated inference if not set
pred_ids = model.generate(preproc_image, max_length=128)
latex_preds = processor.batch_decode(pred_ids, skip_special_tokens=True)

Training Details

Mini-batch size: 8
Optimizer: Adam
LR Scheduler: cosine
fp16 mixed precision
- Trained using automatic mixed precision (AMP) with torch.cuda.amp for reduced memory usage.
Gradient accumulation
- Used to simulate a larger effective batch size while keeping per-step memory consumption low.
- Optimizer steps occurred every 8 mini-batches.
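
With a mini-batch size of 8 and an optimizer step every 8 mini-batches, the effective batch size works out to 8 × 8 = 64 samples per update. The sketch below illustrates why accumulation simulates a larger batch, using a toy one-parameter least-squares problem in plain Python. It is not the author's training loop: the functions `grad_mse` and `accumulated_grad` and the synthetic data are assumptions made purely for illustration.

```python
import math


def grad_mse(w, batch):
    """Gradient w.r.t. w of mean((w * x - y) ** 2) over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, mini_batches):
    """Sum per-mini-batch gradients, each scaled by 1 / accum_steps."""
    accum_steps = len(mini_batches)
    total = 0.0
    for batch in mini_batches:
        # Scaling each mini-batch keeps the sum equal to the mean over all samples.
        total += grad_mse(w, batch) / accum_steps
    return total


MINI_BATCH_SIZE = 8  # from the training details above
ACCUM_STEPS = 8      # optimizer step every 8 mini-batches

# Synthetic data (hypothetical): 64 points on the line y = 3x + 1.
data = [(float(i), 3.0 * i + 1.0) for i in range(MINI_BATCH_SIZE * ACCUM_STEPS)]
mini_batches = [data[i:i + MINI_BATCH_SIZE]
                for i in range(0, len(data), MINI_BATCH_SIZE)]

w = 0.5
full_batch = grad_mse(w, data)
accumulated = accumulated_grad(w, mini_batches)
matches = math.isclose(full_batch, accumulated, rel_tol=1e-9)
```

Because all mini-batches have the same size, the accumulated gradient equals the gradient of one 64-sample batch, while only 8 samples need to be held in memory at a time.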

Evaluation

Performance was evaluated using Character Error Rate (CER) defined as:

CER = (Substitutions + Insertions + Deletions) / Total Characters in Ground Truth

✅ Why CER?
- Math expressions are structurally sensitive. Changing or swapping even a single character can completely change the meaning.
  - x^2 vs. x_2
  - \frac{a}{b} vs. \frac{b}{a}
- CER penalizes even small errors in syntax.
Evaluation yielded a CER of 14.9%.
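
For reference, the CER formula above can be computed with a standard Levenshtein edit distance. The following is a minimal sketch, assuming character-level comparison of raw LaTeX strings. The function names `edit_distance` and `cer` are illustrative, and the exact evaluation script behind the reported number (for example, any normalization or how scores were averaged across samples) is not specified here.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance divided by the ground-truth length."""
    if not ref:
        raise ValueError("ground truth must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# The two examples from the list above.
cer_power = cer("x^2", "x_2")                   # one substitution over 3 characters
cer_frac = cer(r"\frac{a}{b}", r"\frac{b}{a}")  # two substitutions over 11 characters
```

Both pairs differ by only one or two characters, yet the expressions mean something different, which is the kind of error CER is meant to surface.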

BibTeX and Citation

The original TrOCR model was introduced in this paper:

TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models by Li et al.

You can find the source code in their repository.

@misc{li2021trocr,
      title={TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models}, 
      author={Minghao Li and Tengchao Lv and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei},
      year={2021},
      eprint={2109.10282},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Model size: 0.3B params (Safetensors, tensor type F32)
