Update README.md
README.md CHANGED
@@ -7,7 +7,10 @@ base_model:
 pipeline_tag: text-classification
 widget:
 - text: >-
-    MAEATAE, (Géogr. anc.) anciens peuples de l'île de la grande Bretagne ; ils
 ---

@@ -28,9 +31,9 @@ It has been trained on a manually annotated subset of the French *Encyclopédie

 <!-- Provide a longer summary of what this model is. -->

-- **
 - **Model type:** Text classification
-- **Repository:** [https://
 - **Language(s) (NLP):** French
 - **License:** cc-by-nc-4.0

@@ -41,20 +44,20 @@ It has been trained on a manually annotated subset of the French *Encyclopédie
 The tagset is as follows:
 - **Place**: encyclopedia entry describing the name of a place (such as a city, a river, a country, etc.)
 - **Person**: encyclopedia entry describing the name of a people or community
-- **


 ## Dataset


-The model was trained using
-The

 | | Train | Validation | Test|
 |---|:---:|:---:|:---:|
-| Place |
-| Person |
-| Misc |


 ## Evaluation
@@ -65,7 +68,7 @@ The datasets have the following distribution of entries among datasets and class

 | | Precision | Recall | F-score |
 |---|:---:|:---:|:---:|
-| | 0.

@@ -73,9 +76,9 @@ The datasets have the following distribution of entries among datasets and class

 | | Precision | Recall | F-score | Support |
 |---|:---:|:---:|:---:|:---:|
-| Place | 0.
-| Person |
-|

@@ -106,9 +109,9 @@ for sample in samples:
     print(pipe(sample))

 # Output
-[{'label': 'Place', 'score': 0.
-[{'label': 'Person', 'score': 0.
-[{'label': '

 ```

@@ -124,4 +127,4 @@ This model was trained entirely on French encyclopaedic entries classified as Ge
 ## Acknowledgement

 The authors are grateful to the [ASLAN project](https://aslan.universite-lyon.fr) (ANR-10-LABX-0081) of the Université de Lyon, for its financial support within the French program "Investments for the Future" operated by the National Research Agency (ANR).
-Data courtesy the [ARTFL Encyclopédie Project](https://artfl-project.uchicago.edu), University of Chicago.

 pipeline_tag: text-classification
 widget:
 - text: >-
+    MAEATAE, (Géogr. anc.) anciens peuples de l'île de la grande Bretagne ; ils
+    étoient auprès du mur qui coupoit l'île en deux parties.
+datasets:
+- GEODE/GeoEDdA-TopoRel
 ---

 <!-- Provide a longer summary of what this model is. -->

+- **Authors:** Bin Yang, [Ludovic Moncla](https://ludovicmoncla.github.io), [Fabien Duchateau](https://perso.liris.cnrs.fr/fabien.duchateau/) and [Frédérique Laforest](https://perso.liris.cnrs.fr/flaforest/) in the framework of the [ECoDA](https://liris.cnrs.fr/projet-institutionnel/fil-2025-projet-ecoda) and [GEODE](https://geode-project.github.io) projects
 - **Model type:** Text classification
+- **Repository:** [https://gitlab.liris.cnrs.fr/ecoda/encyclopedia2geokg](https://gitlab.liris.cnrs.fr/ecoda/encyclopedia2geokg)
 - **Language(s) (NLP):** French
 - **License:** cc-by-nc-4.0

 The tagset is as follows:
 - **Place**: encyclopedia entry describing the name of a place (such as a city, a river, a country, etc.)
 - **Person**: encyclopedia entry describing the name of a people or community
+- **Other**: encyclopedia entry describing any other type of entity (such as abstract geographic concepts, cross-references to other entries, etc.)


 ## Dataset


+The model was trained using the [GeoEDdA-TopoRel](https://huggingface.co/datasets/GEODE/GeoEDdA-TopoRel) dataset.
+The dataset is split into train, validation and test sets, which have the following distribution of entries among classes:

 | | Train | Validation | Test|
 |---|:---:|:---:|:---:|
+| Place | 1,800 | 225 | 225|
+| Person | 200 | 25 | 25 |
+| Misc | 200 | 25 | 25 |


 ## Evaluation

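The split table in the Dataset section can be sanity-checked with a few lines of Python. The sketch below hard-codes the counts from that table; the names `SPLIT_COUNTS`, `split_totals` and `class_shares` are illustrative and not part of the repository.

```python
# Entry counts copied from the dataset table; all names here are illustrative.
SPLIT_COUNTS = {
    "Place": {"train": 1800, "validation": 225, "test": 225},
    "Person": {"train": 200, "validation": 25, "test": 25},
    "Misc": {"train": 200, "validation": 25, "test": 25},
}


def split_totals(counts):
    """Return the total number of entries in each split."""
    totals = {}
    for per_split in counts.values():
        for split, n in per_split.items():
            totals[split] = totals.get(split, 0) + n
    return totals


def class_shares(counts, split):
    """Return each class's fraction of the entries in one split."""
    total = split_totals(counts)[split]
    return {label: per_split[split] / total for label, per_split in counts.items()}


if __name__ == "__main__":
    print(split_totals(SPLIT_COUNTS))
    for split in ("train", "validation", "test"):
        print(split, class_shares(SPLIT_COUNTS, split))
```

With these counts, Place makes up roughly 82% of every split, so the per-class scores are more informative than a single aggregate figure.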
 | | Precision | Recall | F-score |
 |---|:---:|:---:|:---:|
+| | 0.980 | 0.978 | 0.979 |

 | | Precision | Recall | F-score | Support |
 |---|:---:|:---:|:---:|:---:|
+| Place | 0.99 | 0.98 | 0.99 | 225 |
+| Person | 1.00 | 0.96 | 0.98 | 25 |
+| Other | 0.83 | 0.96 | 0.89 | 25 |

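This diff does not say how the single overall row (0.980 / 0.978 / 0.979) was aggregated from the per-class scores. As a rough cross-check, the sketch below computes both the macro and the support-weighted averages from the rounded per-class values in the table. The names `PER_CLASS`, `macro_average` and `weighted_average` are illustrative, and because the inputs are rounded to two decimals the results may differ slightly from the reported figures.

```python
# Per-class scores copied from the evaluation table:
# (precision, recall, f_score, support). Names are illustrative.
PER_CLASS = {
    "Place": (0.99, 0.98, 0.99, 225),
    "Person": (1.00, 0.96, 0.98, 25),
    "Other": (0.83, 0.96, 0.89, 25),
}


def macro_average(rows):
    """Unweighted mean of precision, recall and F-score across classes."""
    n = len(rows)
    return tuple(sum(r[i] for r in rows.values()) / n for i in range(3))


def weighted_average(rows):
    """Mean of precision, recall and F-score weighted by class support."""
    total = sum(r[3] for r in rows.values())
    return tuple(sum(r[i] * r[3] for r in rows.values()) / total for i in range(3))


if __name__ == "__main__":
    print("macro:   ", macro_average(PER_CLASS))
    print("weighted:", weighted_average(PER_CLASS))
```

The weighted variant is dominated by the Place class (225 of 275 test entries), while the macro variant gives the smaller Person and Other classes equal weight.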
     print(pipe(sample))

 # Output
+[{'label': 'Place', 'score': 0.9984742999076843}]
+[{'label': 'Person', 'score': 0.9927592277526855}]
+[{'label': 'Other', 'score': 0.9885557293891907}]

 ```

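Each call in the README example returns a list holding one dictionary with a `label` and a `score`. The sketch below shows one way downstream code might consume that shape without loading the model, using the three outputs from the README as sample data. The helper `top_label` and its `min_score` threshold are hypothetical and do not come from the model card.

```python
# Outputs copied from the README example; each pipeline call returns a list of dicts.
EXAMPLE_OUTPUTS = [
    [{"label": "Place", "score": 0.9984742999076843}],
    [{"label": "Person", "score": 0.9927592277526855}],
    [{"label": "Other", "score": 0.9885557293891907}],
]


def top_label(prediction, min_score=0.9):
    """Return the highest-scoring label, or None if it scores below min_score.

    `min_score` is a hypothetical threshold chosen for illustration only.
    """
    best = max(prediction, key=lambda item: item["score"])
    return best["label"] if best["score"] >= min_score else None


if __name__ == "__main__":
    for prediction in EXAMPLE_OUTPUTS:
        print(top_label(prediction))
```

Returning `None` for low-confidence predictions lets a caller route uncertain entries to manual review instead of silently accepting them.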
 ## Acknowledgement

 The authors are grateful to the [ASLAN project](https://aslan.universite-lyon.fr) (ANR-10-LABX-0081) of the Université de Lyon, for its financial support within the French program "Investments for the Future" operated by the National Research Agency (ANR).
+Data courtesy the [ARTFL Encyclopédie Project](https://artfl-project.uchicago.edu), University of Chicago.