---
title: FastCDM
emoji: ⚡️
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
short_description: High-performance LaTeX formula evaluation tool using KaTeX.
tags:
- latex
- formula-recognition
- evaluation
- katex
---
<div align="center">
# ⚡️FastCDM
[**[GitHub Repo]**](https://github.com/SoMarkAI/FastCDM) | [**[HuggingFace Spaces]**](https://huggingface.co/spaces/SoMark/FastCDM)
<p>
<a href="https://pypi.org/project/fastcdm/">
<img src="https://img.shields.io/badge/pypi-v0.1.1-blue"
alt="PyPI package version">
</a>
<a href="https://www.python.org">
<img src="https://img.shields.io/badge/python-3.8%2B-blue"
alt="Python versions">
</a>
<a href="#">
<img src="https://img.shields.io/badge/license-Apache%202.0-blue"
alt="GitHub license">
</a>
</p>
</div>
## 🚀 Introduction
[CDM](https://github.com/opendatalab/UniMERNet/tree/main/cdm) evaluates formula recognition by rendering the predicted and ground-truth LaTeX formulas as images, then matching them character by character using visual feature extraction, localization, and spatial position information. This keeps the evaluation objective and accurate.
**FastCDM** is a high-performance optimized version of the original [CDM](https://github.com/opendatalab/UniMERNet/tree/main/cdm) that addresses its speed limitations. It replaces traditional LaTeX compilation with the browser-based KaTeX rendering engine, which makes rendering significantly faster.
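As a rough illustration of how character-level matching turns into a score, the sketch below derives precision, recall, and an F1-style score from match counts. It assumes the score is the F1 of precision and recall, which is how the original CDM is commonly described; FastCDM's internal computation may differ. The function name `match_scores` and the glyph counts are hypothetical.

```python
from typing import Tuple


def match_scores(n_matched: int, n_pred: int, n_gt: int) -> Tuple[float, float, float]:
    """Return (f1, recall, precision) from character-match counts.

    Assumption: the overall score is the harmonic mean (F1) of
    precision and recall. This is an illustration, not FastCDM's code.
    """
    precision = n_matched / n_pred if n_pred else 0.0
    recall = n_matched / n_gt if n_gt else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return f1, recall, precision


# Hypothetical counts: 5 glyphs in the ground truth, 7 in the prediction,
# and all 5 ground-truth glyphs found in the prediction.
f1, recall, precision = match_scores(n_matched=5, n_pred=7, n_gt=5)
print(f"f1={f1:.3f} recall={recall:.3f} precision={precision:.3f}")
```

With these counts, every ground-truth glyph is recovered (recall 1.0), while the two extra predicted glyphs lower precision and therefore the combined score.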
### 🎯 Project Goals
The core objective of FastCDM is to **make evaluation convenient during training**, helping to advance formula recognition tasks. We are committed to:
- Providing a simple, easy-to-use API so that evaluation can be integrated into the training loop.
- Supporting both real-time evaluation and batch evaluation modes.
- Providing visualization tools for evaluation metrics during the training process.
### Why Choose FastCDM?
1. **Extreme Performance**: Built on the KaTeX rendering engine, it is tens of times faster than the traditional LaTeX compilation pipeline.
2. **Simplified Deployment**: No need to install a complex LaTeX environment (ImageMagick, texlive-full, etc.).
3. **Accurate Evaluation**: Uses character detection and matching, avoiding the unfairness of traditional text-based metrics.
4. **Continuous Optimization**: Supplements and refines CDM's symbol support through ongoing iterative improvements.
5. **Easy Integration**: Provides a unified API that is easy to integrate into various training frameworks. Integration with mainstream training frameworks such as PyTorch and Transformers is planned.
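To see why text-based metrics can be unfair, consider two LaTeX strings that render to the same formula but differ as text. The sketch below uses Python's standard `difflib` as a stand-in for a generic string-similarity metric; it is not the metric used by any particular benchmark, and the helper name `text_similarity` is hypothetical.

```python
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two strings."""
    return SequenceMatcher(None, a, b).ratio()


# Both strings render as x with subscript 1 and superscript 2.
gt = r"x^{2}_{1}"
pred = r"x_{1}^{2}"

score = text_similarity(gt, pred)
print(f"text similarity: {score:.2f}")  # below 1.0 despite identical rendering
```

A text metric penalizes the prediction for a harmless reordering, whereas an image-based comparison of the rendered formulas would treat the two as equivalent.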
### ⚠️ Note
Although KaTeX is extremely fast, it is a lightweight renderer optimized for the Web and does not support every LaTeX command, particularly obscure ones.
It handles the vast majority of standard formulas correctly, which makes it a reasonable and sustainable technical choice.
You can check KaTeX's support coverage here: 🔗 [KaTeX Support Table](https://katex.org/docs/support_table)
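When checking a dataset against the support table, it can help to list the LaTeX commands each formula actually uses. The following is a minimal sketch using only the standard library; `latex_commands` is a hypothetical helper and not part of FastCDM or KaTeX. It is a rough scan: it only finds control words made of letters and can misread a line break `\\` that is immediately followed by a letter.

```python
import re
from typing import List


def latex_commands(formula: str) -> List[str]:
    """Return the sorted, de-duplicated control words (such as \\frac) in a formula."""
    return sorted(set(re.findall(r"\\[A-Za-z]+", formula)))


cmds = latex_commands(r"\frac{a}{b} + \sqrt{\alpha} + \frac{c}{d}")
print(cmds)
```

The resulting names can then be looked up manually in the KaTeX support table before running a large evaluation.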
---
## Usage
### Installation
Install `node.js` and `chromedriver` before installing FastCDM.
* For `node.js`, see the [official Node.js site](https://nodejs.org/).
* For `chromedriver`, see the [chromedriver installation guide](docs/chromedriver_installation.md).
```bash
pip install fastcdm
```
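To confirm the prerequisites are visible before running an evaluation, a quick check along these lines may help. It is a sketch using only the standard library; `find_prerequisites` is a hypothetical helper, and the executable names `node` and `chromedriver` are assumptions about how the tools are installed on your system.

```python
import shutil
from typing import Dict, Optional, Sequence


def find_prerequisites(
    names: Sequence[str] = ("node", "chromedriver"),
) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_prerequisites().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

A missing `chromedriver` entry is not necessarily a problem, since its location can be passed explicitly through `chromedriver_path`.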
### Quick Start
```python
from fastcdm import FastCDM
chromedriver_path = "driver/chromedriver"
# Initialize FastCDM evaluator
evaluator = FastCDM(chromedriver_path=chromedriver_path)
# Evaluate
cdm_score, recall, precision = evaluator.compute(gt="E = mc^2", pred="E + 1 = mc^2", visualize=False)
# Evaluate and visualize
cdm_score, recall, precision, vis_img = evaluator.compute(gt="E = mc^2", pred="E + 1 = mc^2", visualize=True)
```
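For batch evaluation, one straightforward approach is to call `compute` on each pair and average the results. The sketch below assumes only the `compute(gt=..., pred=..., visualize=False)` call shown above, returning `(cdm_score, recall, precision)`; `mean_scores` is a hypothetical helper, not part of the FastCDM API.

```python
from typing import Iterable, Tuple


def mean_scores(evaluator, pairs: Iterable[Tuple[str, str]]) -> Tuple[float, ...]:
    """Average (cdm_score, recall, precision) over (gt, pred) pairs.

    `evaluator` is any object exposing the compute() call from Quick Start.
    """
    totals = [0.0, 0.0, 0.0]
    count = 0
    for gt, pred in pairs:
        scores = evaluator.compute(gt=gt, pred=pred, visualize=False)
        for i, value in enumerate(scores):
            totals[i] += value
        count += 1
    if count == 0:
        raise ValueError("no formula pairs to evaluate")
    return tuple(total / count for total in totals)


# Usage with the evaluator created in Quick Start:
# pairs = [("E = mc^2", "E + 1 = mc^2"), ("a^2 + b^2", "a^2 + b^2")]
# avg_cdm, avg_recall, avg_precision = mean_scores(evaluator, pairs)
```

Averaging per-formula scores is only one possible aggregation; weighting by formula length is another option, depending on what the training run should optimize.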
### Interactive Demo
We provide an interactive visualization demo built with Gradio, which you can try on [HuggingFace Spaces](https://huggingface.co/spaces/SoMark/FastCDM). You can also launch it locally:
```bash
python3 scripts/app.py
```
## Contribution and Feedback
We welcome all forms of contribution, including but not limited to:
- Submitting issue reports
- Suggesting improvements
- Submitting code changes (please open an issue for discussion first)
Please contact us via the project's [issues](https://github.com/SoMarkAI/FastCDM/issues).
---
## License
This project is open-sourced under the Apache 2.0 license. You are free to use, modify, and distribute the code of this project under the terms of the license.