Update README.md
README.md
CHANGED
@@ -40,10 +40,10 @@ It achieves the following results on the evaluation set:
 
 The model is fine-tuned with academic publications in Linguistics, to classify texts in publications into 4 classes as a filter to other tasks. Sentence-based data obtained from OCR-processed PDF files was annotated manually with the following classes:
 
-0: out of scope - materials that are of low significance, eg. page number and page header, noise from OCR/pdf-to-text convertion
-1: main text - texts that are the main texts of the publication, to be used for down-stream tasks
-2: examples - texts that are captions of the figures, or quotes or excerpts
-3: references - references of the publication, excluding in-text citations
+- 0: out of scope - materials that are of low significance, eg. page number and page header, noise from OCR/pdf-to-text convertion
+- 1: main text - texts that are the main texts of the publication, to be used for down-stream tasks
+- 2: examples - texts that are captions of the figures, or quotes or excerpts
+- 3: references - references of the publication, excluding in-text citations
 
 ## Intended uses & limitations
 
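
To illustrate how the four classes described in the README could serve as a filter for downstream tasks, here is a minimal sketch. The names `ID2LABEL` and `keep_main_text` are hypothetical, the label strings are shortened from the README wording, and the predictions are assumed to arrive as `(sentence, class_id)` pairs from whatever inference code is used. None of this is taken from the model card itself.

```python
# Class ids as listed in the README; label strings are shortened for illustration.
ID2LABEL = {
    0: "out of scope",
    1: "main text",
    2: "examples",
    3: "references",
}


def keep_main_text(predictions):
    """Return only the sentences predicted as class 1 (main text).

    `predictions` is assumed to be an iterable of (sentence, class_id) pairs.
    """
    return [sentence for sentence, class_id in predictions if class_id == 1]


# Placeholder predictions, invented purely to show the filtering step.
sample = [
    ("12", 0),
    ("Vowel harmony is attested in many languages.", 1),
    ("Figure 2: Vowel inventory.", 2),
    ("Author, A. (Year). Title.", 3),
]
main_text = keep_main_text(sample)
```

Keeping only class 1 matches the README's description of main text as the material "to be used for down-stream tasks"; a different task might instead keep class 3 to collect the reference list.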