---
license: cc-by-nc-4.0
language:
- fr
pipeline_tag: text-classification
widget:
- text: >-
    * ALBI, (Géog.) ville de France, capitale de l'Albigeois, dans le haut
    Languedoc : elle est sur le Tarn. Long. 19. 49. lat. 43. 55. 44.
---
# bert-base-multilingual-cased-edda-domain-classification
This model classifies encyclopedia articles into knowledge domains (e.g., History, Geography, Medicine).
It is a fine-tuned version of the `bert-base-multilingual-cased` model.
It was fine-tuned on the French *Encyclopédie, ou dictionnaire raisonné des sciences, des arts et des métiers, par une société de gens de lettres* (1751-1772), edited by Diderot and d'Alembert (text provided by the [ARTFL Encyclopédie Project](https://artfl-project.uchicago.edu)).
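
The classification described above can be tried with the `transformers` library. The following is a minimal sketch, not an official usage recipe: the `GEODE/` namespace in `MODEL_ID` is an assumption (only the model name appears in this card), the helper names `classify` and `best_prediction` are hypothetical, and the `top_k` argument assumes a reasonably recent `transformers` release.

```python
from typing import Dict, List

# Assumption: the "GEODE/" namespace is a guess. Replace MODEL_ID with the
# actual Hub repository id of this model.
MODEL_ID = "GEODE/bert-base-multilingual-cased-edda-domain-classification"

# Example article taken from this card's widget.
ARTICLE = (
    "* ALBI, (Géog.) ville de France, capitale de l'Albigeois, dans le haut "
    "Languedoc : elle est sur le Tarn. Long. 19. 49. lat. 43. 55. 44."
)


def best_prediction(predictions: List[Dict]) -> Dict:
    """Return the highest-scoring {"label", "score"} entry."""
    return max(predictions, key=lambda p: p["score"])


def classify(text: str, model_id: str = MODEL_ID) -> Dict:
    """Classify one article and return its most likely knowledge domain."""
    # Imported lazily so that best_prediction() works without transformers.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # top_k=None requests a score for every class; truncation=True cuts
    # articles that exceed the model's maximum input length.
    predictions = classifier(text, top_k=None, truncation=True)
    return best_prediction(predictions)
```

Calling `classify(ARTICLE)` downloads the checkpoint on first use and returns a dictionary with `label` and `score` keys. Long articles are truncated rather than split, so only the beginning of an entry contributes to the prediction in this sketch.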
## Model Description
- **Developed by:** [Alice Brenon](https://perso.liris.cnrs.fr/abrenon/), [Ludovic Moncla](https://ludovicmoncla.github.io), [Katherine McDonough](https://www.lancaster.ac.uk/dsi/about-us/members/katherine-mcdonough#projects), and Khaled Chabane in the framework of the [GEODE](https://geode-project.github.io) project.
- **Model type:** Text classification
- **Repository:** [https://gitlab.liris.cnrs.fr/geode/EDdA-Classification/](https://gitlab.liris.cnrs.fr/geode/EDdA-Classification/)
- **Language(s) (NLP):** French
- **License:** cc-by-nc-4.0
## Class labels
%TODO
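
If the published configuration defines `id2label`, the label set can be read directly from the checkpoint. A minimal sketch, assuming the `transformers` library and a hypothetical `GEODE/` namespace in `MODEL_ID`; the helper names are illustrative, and a config without named labels would return generic names such as `LABEL_0`:

```python
from typing import Dict, List

# Assumption: the "GEODE/" namespace is a guess. Replace MODEL_ID with the
# actual Hub repository id of this model.
MODEL_ID = "GEODE/bert-base-multilingual-cased-edda-domain-classification"


def ordered_labels(id2label: Dict[int, str]) -> List[str]:
    """Return label names sorted by their class index."""
    return [id2label[index] for index in sorted(id2label)]


def load_labels(model_id: str = MODEL_ID) -> List[str]:
    """Fetch the model config from the Hub and list its class labels."""
    # Imported lazily so that ordered_labels() works without transformers.
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(model_id)
    return ordered_labels(config.id2label)
```

Calling `load_labels()` only downloads the configuration file, not the model weights.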
## Bias, Risks, and Limitations
This model was trained entirely on French encyclopedic entries from the 18th-century *Encyclopédie* and will likely not perform well on text in other languages or on other corpora.
## Cite this work
> Brenon, A., Moncla, L., & McDonough, K. (2022). Classifying encyclopedia articles: Comparing machine and deep learning methods and exploring their predictions. Data & Knowledge Engineering, 142, 102098.
## Acknowledgement
The authors are grateful to the [ASLAN project](https://aslan.universite-lyon.fr) (ANR-10-LABX-0081) of the Université de Lyon, for its financial support within the French program "Investments for the Future" operated by the National Research Agency (ANR).
Data courtesy of the [ARTFL Encyclopédie Project](https://artfl-project.uchicago.edu), University of Chicago.