ALBERT-base-v2-French-NER (ONNX Optimized)
This model is a fine-tuned version of ALBERT-base-v2 for Named Entity Recognition (NER) on French text, using the wikiner_fr dataset.
It is designed to be lightweight (approximately 45 MB) and fast, which makes it well suited to edge applications or environments with limited CPU resources.
Performance
The model was trained for 2 epochs and achieves the following scores on the validation set:
- F1-Score: 0.8303
- Precision: 0.8306
- Recall: 0.8299
- Accuracy: 97.54%
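As a quick consistency check, F1 is the harmonic mean of precision and recall, so the figures above can be cross-checked against each other. The sketch below uses only the rounded values listed on this card. Any small gap from the reported F1 is presumably due to rounding, and the evaluation script in the repository may compute the score differently (for example at entity level).

```python
# Rounded scores as reported on this card
precision = 0.8306
recall = 0.8299
reported_f1 = 0.8303

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

print(f"Recomputed F1: {f1:.4f} (reported: {reported_f1:.4f})")
```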
Supported Entities
The model detects 4 types of entities:
- PER: Persons
- LOC: Locations
- ORG: Organizations
- MISC: Miscellaneous entities
Usage and Source Code
The complete code for training, evaluation, and inference (including optimized ONNX export) is available on GitHub:
Juste-Leo2/ALBERT-base-v2-french-ner
Quick Inference Example
from transformers import pipeline

# Direct loading via Transformers
ner_pipeline = pipeline(
    "token-classification",
    model="JusteLeo/ALBERT-base-v2-french-ner",
    aggregation_strategy="simple",
)

text = "Bonjour, mon prénom est Thomas et j'habite à Paris."
results = ner_pipeline(text)

for entity in results:
    print(f"{entity['entity_group']} : {entity['word']} ({entity['score']:.2%})")
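With `aggregation_strategy="simple"`, the pipeline returns a list of dictionaries with keys such as `entity_group`, `word`, `score`, `start` and `end`. A common follow-up step is to keep only some entity types above a confidence threshold. The sketch below shows one way to do that. The helper name `filter_entities`, the default threshold of 0.80 and the sample data are illustrative assumptions, not part of the model or its repository.

```python
from typing import Iterable


def filter_entities(results, wanted_types: Iterable[str] = ("PER", "LOC"), min_score: float = 0.80):
    """Keep entities of the requested types whose confidence reaches min_score."""
    wanted = set(wanted_types)
    return [
        entity for entity in results
        if entity["entity_group"] in wanted and entity["score"] >= min_score
    ]


# Made-up sample in the pipeline's output format (not real model output)
sample_results = [
    {"entity_group": "PER", "word": "Thomas", "score": 0.97, "start": 24, "end": 30},
    {"entity_group": "LOC", "word": "Paris", "score": 0.99, "start": 45, "end": 50},
    {"entity_group": "MISC", "word": "Bonjour", "score": 0.41, "start": 0, "end": 7},
]

kept = filter_entities(sample_results)
for entity in kept:
    print(entity["entity_group"], entity["word"])
```

In practice, `sample_results` would be replaced by the `results` list returned by `ner_pipeline(text)`.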
ONNX Optimization
The model is provided with an ONNX Runtime-compatible configuration for faster execution on modern processors. The optimized conversion and inference scripts are available in the GitHub repository mentioned above.
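One possible way to run the model with ONNX Runtime is through the Hugging Face Optimum library. This is a minimal sketch and not the repository's own script: it assumes `optimum[onnxruntime]` and `transformers` are installed, and it converts the PyTorch checkpoint on the fly with `export=True` rather than relying on any particular ONNX file shipped with the model. The helper name `build_onnx_ner_pipeline` is hypothetical.

```python
MODEL_ID = "JusteLeo/ALBERT-base-v2-french-ner"


def build_onnx_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline backed by ONNX Runtime.

    Assumption: requires `pip install "optimum[onnxruntime]" transformers`.
    """
    # Imported lazily so this module can be loaded without these packages
    from optimum.onnxruntime import ORTModelForTokenClassification
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the PyTorch weights to ONNX at load time
    model = ORTModelForTokenClassification.from_pretrained(model_id, export=True)
    return pipeline(
        "token-classification",
        model=model,
        tokenizer=tokenizer,
        aggregation_strategy="simple",
    )
```

Calling `build_onnx_ner_pipeline()` downloads the model and returns an object that can be used like `ner_pipeline` in the quick inference example, for instance `build_onnx_ner_pipeline()(text)`.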
License: Apache 2.0
Author: JusteLeo
Model tree for JusteLeo/ALBERT-base-v2-french-ner
- Base model: albert/albert-base-v2
- Dataset used to train: wikiner_fr
Evaluation results
- F1-Score on the wikiner_fr test set (self-reported): 0.830