---
license: cc-by-nc-4.0
language:
- fr
base_model:
- google/mt5-small
pipeline_tag: text-generation
citation: |
  @inproceedings{moncla2026edda,
    title={EDDA-Coordinata: An Annotated Dataset of Historical Geographic Coordinates},
    author={Moncla, Ludovic and Nugues, Pierre and Joliveau, Thierry and McDonough, Katherine},
    booktitle={Proceedings of the 2026 Language Resources and Evaluation Conference (LREC 2026)},
    year={2026},
    url={https://arxiv.org/abs/2602.23941}
  }
---
# Model Card of `GEODE/mt5-small-coords-norm`
This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) that extracts geographic coordinates from French text and normalizes them.
### Overview
- **Language model:** [google/mt5-small](https://huggingface.co/google/mt5-small)
- **Language:** French
- **Training data:** EDDA-Coordinata (`edda-coordinata`)
- **Online Demo:** [https://huggingface.co/spaces/GEODE/edda-coordinates](https://huggingface.co/spaces/GEODE/edda-coordinates)
- **Repository:** []()
### Usage
```python
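from transformers import pipeline

# Load the fine-tuned model as a text-to-text (sequence-to-sequence) pipeline.
pipe = pipeline("text2text-generation", model="GEODE/mt5-small-coords-norm")

# An encyclopedia entry that states a longitude and a latitude.
entry = "* AACH ou ACH, s. f. petite ville d'Allemagne dans le cercle de Souabe, près de la source de l'Aach. Long. 26. 57. lat. 47. 55."

# The pipeline returns a list with one dict per input; the text is under "generated_text".
print(pipe(entry)[0]["generated_text"])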
```
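Downstream tools such as mapping libraries usually expect decimal degrees rather than degrees, minutes and seconds. The sketch below shows one way to convert the values stated in the example entry. It is an illustration only: `dms_to_decimal` is a hypothetical helper, not part of this model or its repository, and it does not parse the model's output string, whose exact format is not documented here. It also ignores hemisphere and prime meridian, and historical longitudes may not be measured from Greenwich.

```python
def dms_to_decimal(degrees: int, minutes: int = 0, seconds: float = 0.0) -> float:
    """Convert a non-negative degrees/minutes/seconds triple to decimal degrees.

    Hypothetical helper for illustration; hemisphere and prime meridian
    are deliberately left to the caller.
    """
    if degrees < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError("expected degrees >= 0 and minutes/seconds in [0, 60)")
    return degrees + minutes / 60 + seconds / 3600


# Values read from the example entry: "Long. 26. 57. lat. 47. 55."
longitude = dms_to_decimal(26, 57)
latitude = dms_to_decimal(47, 55)
print(round(longitude, 4), round(latitude, 4))  # 26.95 47.9167
```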
## Evaluation
### 5-Fold Cross-Validation Results
| Metric | Score |
|:-----------------|---------:|
| Mean Exact Match | 0.8365 |
| Mean Char F1 | 0.9675 |
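
This card does not define the two metrics, so the following is only a plausible reading, not the authors' evaluation code. It assumes that exact match compares whitespace-stripped strings and that Char F1 is the harmonic mean of precision and recall over the multiset of characters shared by prediction and reference. The function names and the sample fold scores are hypothetical.

```python
from collections import Counter
from statistics import mean


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 when the stripped strings are identical, else 0.0."""
    return float(prediction.strip() == reference.strip())


def char_f1(prediction: str, reference: str) -> float:
    """Character-level F1 over the multiset of shared characters (assumed definition)."""
    if prediction == reference:
        return 1.0
    pred_counts, ref_counts = Counter(prediction), Counter(reference)
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-fold scores, averaged as in a 5-fold cross-validation.
fold_scores = [0.9, 0.8, 0.85, 0.8, 0.9]
print(mean(fold_scores))
```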
## Training hyperparameters
The following hyperparameters were used during fine-tuning:
- dataset_path:
- dataset_name: edda-coordinata
- input_types: ['encyclopedic_text_entry', 'dms_coordinates']
- output_types: 'dms_coordinates'
- model: google/mt5-small
- max_length: 512
- max_length_output: 128
- epoch: 10
- batch: 8
- lr: 0.0005
- random_seed: 42
- gradient_accumulation_steps: 1
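
For readers who want to reproduce a similar setup with the `transformers` trainer, the sketch below maps the values listed above onto keyword names accepted by `Seq2SeqTrainingArguments`. The mapping is an assumption: the card does not say which training script or argument names were used, and treating `batch` as a per-device batch size is a guess.

```python
# Hyperparameters copied from the list above.
card_hparams = {
    "max_length": 512,
    "max_length_output": 128,
    "epoch": 10,
    "batch": 8,
    "lr": 0.0005,
    "random_seed": 42,
    "gradient_accumulation_steps": 1,
}

# Assumed mapping onto Seq2SeqTrainingArguments keyword names.
training_kwargs = {
    "num_train_epochs": card_hparams["epoch"],
    "per_device_train_batch_size": card_hparams["batch"],
    "learning_rate": card_hparams["lr"],
    "seed": card_hparams["random_seed"],
    "gradient_accumulation_steps": card_hparams["gradient_accumulation_steps"],
}

# Examples per optimizer step on one device: batch size times accumulation steps.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(effective_batch_size)  # 8
```

With `transformers` installed, these could be passed as `Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir` is a placeholder. The two length limits would apply at tokenization time rather than in the training arguments.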
## Citation
If you use the **EDDA-Coordinata** dataset or the associated models, please cite our LREC 2026 paper:
```bibtex
@inproceedings{moncla2026edda,
title={EDDA-Coordinata: An Annotated Dataset of Historical Geographic Coordinates},
author={Moncla, Ludovic and Nugues, Pierre and Joliveau, Thierry and McDonough, Katherine},
booktitle={Proceedings of the 2026 Language Resources and Evaluation Conference (LREC 2026)},
year={2026},
url={https://arxiv.org/abs/2602.23941}
}
```