Update README.md
README.md
CHANGED
@@ -7,7 +7,8 @@ tags: []
 
 
 
-
+This is a ModernBERT model trained on LaTeX files from mathematical papers, to improve its ability to read LaTeX, especially mathematical equations and math-heavy passages.
+I
 
 
 
@@ -15,9 +16,13 @@ weaufia' faopwf oain lk<!-- Provide a quick summary of what the model is/does. -
 
 ### Model Description
 
-
+It was trained on 12,099 mathematical papers, which were preprocessed to remove the parts that carry no meaningful content.
+We also removed LaTeX commands that carry no information, such as matches of `\\(?:begin|end)\{[^}]+\}`, the `\item` command, and matches of `\\(?:noindent|medskip|smallskip|bigskip|newpage|clearpage)`.
+Other similar commands were removed as well.
+The dataset was obtained by scraping mathematical papers.
 
-
+
+One day I will finish this README. If you have a question, feel free to send me an email: garciacomapol@gmail.com
 
 - **Developed by:** [More Information Needed]
 - **Funded by [optional]:** [More Information Needed]
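The command-stripping step described in the added Model Description lines could look roughly like the sketch below. It reuses the patterns quoted in the README; the function name `strip_layout_commands`, the trailing `\b` word boundaries, the whitespace cleanup, and the sample text are all assumptions made for illustration, and the author's actual preprocessing (the "and more" part) is not shown in this commit.

```python
import re

# Patterns quoted in the README. The trailing \b is an assumption added here
# so that commands such as \itemsep are not clipped by the \item rule.
ENV_DELIMS = re.compile(r"\\(?:begin|end)\{[^}]+\}")
ITEM = re.compile(r"\\item\b")
SPACING = re.compile(r"\\(?:noindent|medskip|smallskip|bigskip|newpage|clearpage)\b")


def strip_layout_commands(tex: str) -> str:
    """Remove layout-only LaTeX commands, then drop the whitespace they leave behind."""
    for pattern in (ENV_DELIMS, ITEM, SPACING):
        tex = pattern.sub("", tex)
    # Collapse runs of spaces/tabs, trim each line, and skip lines that became empty.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in tex.splitlines()]
    return "\n".join(line for line in lines if line)


# Hypothetical input, not taken from the training data.
sample = r"""\begin{theorem}
\noindent Let $x \in \mathbb{R}$.
\begin{itemize}
\item Then $x^2 \ge 0$.
\end{itemize}
\end{theorem}"""

print(strip_layout_commands(sample))
```

Note that the mathematical content (`$x \in \mathbb{R}$`) is left untouched; only the environment delimiters, `\item`, and spacing commands are dropped.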