---
library_name: transformers
license: apache-2.0
datasets:
- ik-ram28/synthetic-NER-dataset
language:
- fr
base_model:
- Ihor/gliner-biomed-large-v1.0
---
# EvalLLM-GLiNER-Biomedical
## Model Description
This model is a fine-tuned version of [gliner-biomed-large-v1.0](https://huggingface.co/Ihor/gliner-biomed-large-v1.0) specifically designed for French biomedical Named Entity Recognition (NER). It was developed as part of the EvalLLM 2025 challenge.
The model builds on GLiNER's zero-shot capabilities and is fine-tuned on synthetic biomedical data to identify 21 types of biomedical entities in French text.
## Model Details
### Base Model
- **Architecture**: GLiNER (Generalist and Lightweight Model for Named Entity Recognition)
- **Base Version**: gliner-biomed-large-v1.0
- **Language**: French
- **Domain**: Biomedical and health-related text
### Training Configuration
- **Training Epochs**: 3 planned (early stopping ended training at 2.85 epochs)
- **Learning Rate**: 1e-5
- **Weight Decay**: 0.01
- **Scheduler**: Cosine with 10% warm-up
- **Batch Size**: 8
- **Training Data**: 1,748 synthetic documents
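To make the scheduler setting concrete, here is a minimal sketch of a cosine schedule with 10% linear warm-up, using the learning rate, batch size, and document count listed above. It is not the actual training script. The step count assumes that all 1,748 documents are used in every epoch and that there is no gradient accumulation. The linear warm-up shape and the decay to zero are also assumptions, and `lr_at_step` is a hypothetical helper.

```python
import math

NUM_DOCS = 1748      # from the training configuration above
BATCH_SIZE = 8
EPOCHS = 3
BASE_LR = 1e-5
WARMUP_RATIO = 0.10

# Assumption: one optimizer step per batch, no gradient accumulation.
steps_per_epoch = math.ceil(NUM_DOCS / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = int(total_steps * WARMUP_RATIO)


def lr_at_step(step, total=total_steps, warmup=warmup_steps, base_lr=BASE_LR):
    """Linear warm-up to base_lr, then cosine decay towards zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def print_schedule_samples():
    """Print the learning rate at the start, peak, midpoint, and end."""
    for step in (0, warmup_steps, total_steps // 2, total_steps):
        print(f"step {step:4d}: lr = {lr_at_step(step):.2e}")
```

Under these assumptions the learning rate peaks at 1e-5 once warm-up ends and then decreases along a half cosine. Because training stopped at 2.85 epochs, the final steps of the schedule would not have been reached.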
## Entity Types (21 categories)
| Entity Type | French Label | Example |
|-------------|--------------|---------|
| `ABS_DATE` | Date absolue | "15 mars 2020" |
| `ABS_PERIOD` | Période absolue | "janvier 2019 à mars 2020" |
| `BIO_TOXIN` | Toxine biologique | "toxine botulique" |
| `DIS_REF_TO_PATH` | Référence maladie-pathogène | "infection par E. coli" |
| `DOC_AUTHOR` | Auteur de document | "Dr. Martin Dubois" |
| `DOC_DATE` | Date de document | "publié le 12/03/2021" |
| `DOC_SOURCE` | Source de document | "Journal of Medicine" |
| `EVENT_MACRO` | Événement macro | "épidémie de COVID-19" |
| `EVENT_MICRO` | Événement micro | "cas de contamination" |
| `EXPLOSIVE` | Explosif | "TNT", "dynamite" |
| `FUZZY_PERIOD` | Période floue | "début d'année", "récemment" |
| `INF_DISEASE` | Maladie infectieuse | "grippe", "tuberculose" |
| `LOCATION` | Localisation | "Paris", "France" |
| `LOC_REF_TO_ORG` | Référence lieu-organisation | "hôpital de Lyon" |
| `NON_INF_DISEASE` | Maladie non infectieuse | "diabète", "cancer" |
| `ORGANIZATION` | Organisation | "OMS", "Institut Pasteur" |
| `ORG_REF_TO_LOC` | Référence organisation-lieu | "OMS Europe" |
| `PATHOGEN` | Pathogène | "virus Ebola", "E. coli" |
| `PATH_REF_TO_DIS` | Référence pathogène-maladie | "virus causant la grippe" |
| `RADIOISOTOPE` | Radio-isotope | "uranium 235", "césium 137" |
| `REL_DATE` | Date relative | "hier", "la semaine dernière" |
| `REL_PERIOD` | Période relative | "depuis 3 mois" |
| `TOXIC_AGENT` | Agent toxique | "plomb", "mercure" |
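
The table above can be used to build the label list passed to the model. The following is a minimal usage sketch with the `gliner` Python package. The repository id `ik-ram28/EvalLLM-GLiNER` is an assumption, and so is the choice to prompt with the entity codes rather than the French labels. `ENTITY_LABELS`, `extract_entities`, and `run_example` are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# A subset of the table above: entity code -> French label.
ENTITY_LABELS = {
    "INF_DISEASE": "Maladie infectieuse",
    "NON_INF_DISEASE": "Maladie non infectieuse",
    "PATHOGEN": "Pathogène",
    "LOCATION": "Localisation",
    "ORGANIZATION": "Organisation",
    "ABS_DATE": "Date absolue",
}


def extract_entities(model, text, labels, threshold=0.5):
    """Run GLiNER span prediction and group the matched texts by label."""
    grouped = defaultdict(list)
    for ent in model.predict_entities(text, labels, threshold=threshold):
        grouped[ent["label"]].append(ent["text"])
    return dict(grouped)


def run_example(repo_id="ik-ram28/EvalLLM-GLiNER"):
    """Load the model and tag one sentence. repo_id is an assumed id."""
    # Imported here so extract_entities can be used without gliner installed.
    from gliner import GLiNER

    model = GLiNER.from_pretrained(repo_id)
    text = "L'OMS a signalé une épidémie de grippe à Paris le 15 mars 2020."
    # Assumption: the model is prompted with the entity codes.
    return extract_entities(model, text, list(ENTITY_LABELS))
```

Calling `run_example()` downloads the weights and returns a dictionary that maps each predicted label to the matched text spans. If the model was trained with the French labels instead of the codes, pass `list(ENTITY_LABELS.values())` as the label list.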
## Citation
```bibtex
```
## Related Resources
- **GitHub Repository**: [EvalLLM2025](https://github.com/ikram28/EvalLLM2025)
- **Paper**: [Link to paper when published]
- **Challenge**: [EvalLLM 2025](https://evalllm2025.sciencesconf.org/)
## License
This model is released under the Apache 2.0 License.
## Acknowledgments
- GLiNER team for the base architecture
- EvalLLM 2025 organizers