| ## What's the point of this? | |
| LaTeX is the de facto standard markup language for typesetting equations in academic papers. | |
| It is extremely feature-rich and flexible, but also very verbose. | |
| This makes it great for typesetting complex equations, but not very convenient for quick note-taking on the fly. | |
| For example, here's a short equation from [this page](https://en.wikipedia.org/wiki/Quantum_electrodynamics) on Wikipedia about Quantum Electrodynamics | |
| and the corresponding LaTeX code: | |
|  | |
| ``` | |
| {\displaystyle {\mathcal {L}}={\bar {\psi }}(i\gamma ^{\mu }D_{\mu }-m)\psi -{\frac {1}{4}}F_{\mu \nu }F^{\mu \nu },} | |
| ``` | |
| This demo is a first step in solving that problem. | |
| Eventually, you'll be able to take a quick screenshot of an equation from a paper, | |
| and a program built with this model will generate the corresponding LaTeX source code | |
| so that you can copy and paste it straight into your personal notes. | |
| No more endless googling of obscure LaTeX syntax! | |
| ## How does it work? | |
| Because this problem involves looking at an image and generating valid LaTeX code, | |
| the model needs to understand both Computer Vision (CV) and Natural Language Processing (NLP). | |
| Other projects tackle the same problem with some very interesting architectures. | |
| These generally pair an "encoder", which looks at the image and extracts and encodes the information about the equation, | |
| with a "decoder", which takes that information and translates it into what is hopefully both valid and accurate LaTeX code. | |
| Examples: | |
| ... | |
| I chose to tackle this problem with transfer learning. | |
| The biggest reason is limited compute: | |
| I don't have unlimited access to GPU hours and wanted training to be reasonably fast, on the order of a couple of hours. | |
| This approach has other benefits too; | |
| for example, the architecture has already proven robust across various applications, so less time is spent on trial and error. | |
| I chose TrOCR, an OCR machine learning model trained by Microsoft on SROIE data to produce text from receipts. | |
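To make the encoder/decoder idea concrete, here is a minimal sketch of running a pretrained TrOCR checkpoint with the Hugging Face `transformers` library. It only covers inference with the off-the-shelf model, which is the starting point for transfer learning; fine-tuning on equation images is not shown. The checkpoint name `microsoft/trocr-base-printed` is an assumption (one publicly available TrOCR checkpoint, not necessarily the one behind this demo), and the helper names `load_trocr` and `image_to_text` are made up for illustration.

```python
def load_trocr(checkpoint="microsoft/trocr-base-printed"):
    """Load a pretrained TrOCR processor and encoder-decoder model.

    The default checkpoint name is an assumption, not necessarily
    the checkpoint used by this demo.
    """
    # Imported lazily because transformers (and torch) are heavy dependencies.
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    return processor, model


def image_to_text(image, processor, model, max_new_tokens=128):
    """Run one image through the encoder and decode the generated tokens."""
    # The processor resizes and normalizes the image into the tensor
    # that the vision encoder expects.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    # The decoder generates token ids one at a time, conditioned on the
    # encoder's representation of the image.
    generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
    # Turn the token ids back into a string.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

To try it, load an image with Pillow, for example `Image.open("equation.png").convert("RGB")` (the file name is hypothetical), and pass it to `image_to_text` together with the objects returned by `load_trocr()`. Before any fine-tuning, the off-the-shelf model will most likely return plain text rather than LaTeX.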
| <p style='text-align: center'>Made by Young Ho Shin</p> | |
| <p style='text-align: center'> | |
| <a href="mailto:yhshin.data@gmail.com">Email</a> | |
| <a href='https://www.github.com/yhshin11'>GitHub</a> | |
| <a href='https://www.linkedin.com/in/young-ho-shin-3995051b9/'>LinkedIn</a> | |
| </p> |