MatteoFasulo committed
Commit 017f499 · verified · 1 Parent(s): 040a3d6

Update README.md

Files changed (1): README.md +128 -15

README.md CHANGED
@@ -4,6 +4,9 @@ license: apache-2.0
  base_model: answerdotai/ModernBERT-large
  tags:
  - generated_from_trainer
  metrics:
  - precision
  - recall
@@ -11,33 +14,76 @@ metrics:
  - accuracy
  model-index:
  - name: ModernBERT-large-NER
- results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
  # ModernBERT-large-NER

- This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0508
- - Precision: 0.9230
- - Recall: 0.9399
- - F1: 0.9314
- - Accuracy: 0.9861

  ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure
@@ -69,3 +115,70 @@ The following hyperparameters were used during training:
  - Pytorch 2.7.0a0+ecf3bae40a.nv25.02
  - Datasets 4.5.0
  - Tokenizers 0.22.2

  base_model: answerdotai/ModernBERT-large
  tags:
  - generated_from_trainer
+ - named-entity-recognition
+ - token-classification
+ - modernbert
  metrics:
  - precision
  - recall

  - accuracy
  model-index:
  - name: ModernBERT-large-NER
+   results:
+   - task:
+       type: token-classification
+     dataset:
+       name: conll2003
+       type: conll2003
+     metrics:
+     - name: Precision
+       type: precision
+       value: 0.9230
+     - name: Recall
+       type: recall
+       value: 0.9399
+     - name: F1
+       type: f1
+       value: 0.9314
+     - name: Accuracy
+       type: accuracy
+       value: 0.9861
+ datasets:
+ - lhoestq/conll2003
+ language:
+ - en
+ pipeline_tag: token-classification
  ---

  # ModernBERT-large-NER

+ This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) for Named Entity Recognition (NER) on the [conll2003](https://huggingface.co/datasets/lhoestq/conll2003) dataset.

  ## Model description

+ ModernBERT-large-NER is a token classification model trained to identify and categorize named entities in text. Built on the ModernBERT-large architecture, this model leverages modern transformer optimizations for efficient and accurate entity extraction.
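As a token classification model, it emits one label per token, using the BIO scheme of conll2003 (B-/I- prefixes over PER, ORG, LOC, MISC). A minimal, self-contained sketch of how such per-token tags are grouped into entity spans (the tags below are illustrative, not actual model output):

```python
# Minimal sketch: grouping per-token BIO labels into typed entity spans.
# Tags here are illustrative; the model's label set follows the conll2003
# scheme (B-/I- prefixes over PER, ORG, LOC, MISC).

def bio_to_spans(tokens, tags):
    """Return (entity_type, text) spans from parallel token/tag lists."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):  # a new entity begins
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)  # continue the open entity
        else:  # "O" (or an inconsistent I- tag) closes any open entity
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(words)) for label, words in spans]

tokens = ["Steve", "Jobs", "founded", "Apple", "in", "Cupertino"]
tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
# [('PER', 'Steve Jobs'), ('ORG', 'Apple'), ('LOC', 'Cupertino')]
```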
+
+ ## Intended Uses
+
+ **Primary Use Cases:**
+ - Named Entity Recognition in text documents
+ - Information extraction pipelines
+
+ **Intended Users:**
+ - NLP researchers and practitioners
+ - Data scientists working with text data
+ - Developers building information extraction systems
+
+ ## Limitations
+
+ **Known Limitations:**
+ - Performance may vary on domains significantly different from the training data
+ - Entity boundaries might be imperfect for complex or nested entities
+ - May require domain-specific fine-tuning for specialized applications (medical, legal, etc.)
+ - Performance on low-resource languages or code-switched text has not been evaluated
+
+ **Out-of-Scope Uses:**
+ - Real-time processing of sensitive personal information without proper privacy safeguards
+ - High-stakes decision making without human oversight
+ - Applications requiring 100% accuracy in entity detection
  ## Training and evaluation data

+ The model was fine-tuned on the [conll2003](https://huggingface.co/datasets/lhoestq/conll2003) dataset, a standard English NER benchmark annotated with four entity types: persons (PER), organizations (ORG), locations (LOC), and miscellaneous entities (MISC).
+
+ ## Performance
+
+ It achieves the following results on the evaluation set:
+ - Loss: 0.0508
+ - Precision: 0.9230
+ - Recall: 0.9399
+ - F1: 0.9314
+ - Accuracy: 0.9861
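The precision, recall, and F1 above are entity-level, in the style of seqeval: a predicted entity counts as correct only when both its type and its exact span match the gold annotation. A hedged pure-Python sketch of the computation (simplified relative to seqeval's full tag-scheme handling):

```python
# Illustrative sketch of entity-level scoring in the style of seqeval:
# an entity is a true positive only when its type AND exact span match.

def extract_entities(tags):
    """Collect (type, start, end) spans from a BIO tag sequence."""
    entities, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last span
        inside = tag.startswith("I-") and tag[2:] == etype
        if not inside:  # the current span (if any) ends here
            if etype is not None:
                entities.add((etype, start, i))
            if tag[:2] in ("B-", "I-"):
                start, etype = i, tag[2:]  # a new span begins
            else:
                start, etype = None, None
    return entities

def f1_scores(true_tags, pred_tags):
    """Entity-level precision, recall, and F1."""
    gold, pred = extract_entities(true_tags), extract_entities(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "B-ORG"]  # wrong type on the second entity
print(f1_scores(gold, pred))  # (0.5, 0.5, 0.5)
```

Note that the token-level accuracy (0.9861) is computed over individual tags, which is why it is much higher than the entity-level scores.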

  ## Training procedure

  - Pytorch 2.7.0a0+ecf3bae40a.nv25.02
  - Datasets 4.5.0
  - Tokenizers 0.22.2
+
+ ## How to Use
+
+ ```python
+ import torch
+ from transformers import pipeline
+
+ # Create an NER pipeline
+ ner_pipeline = pipeline(
+     "token-classification",
+     model="MatteoFasulo/ModernBERT-large-NER",
+     aggregation_strategy="simple",
+     dtype=torch.bfloat16,
+ )
+
+ # Example usage
+ text = "Apple Inc. was founded by Steve Jobs in Cupertino, California."
+ entities = ner_pipeline(text)
+
+ for entity in entities:
+     print(
+         f"{entity['word']}: {entity['entity_group']} (confidence: {entity['score']:.4f})"
+     )
+
+ # Apple Inc.: ORG (confidence: 0.9684)
+ # Steve Jobs: PER (confidence: 0.9950)
+ # Cupertino: LOC (confidence: 0.9876)
+ # California: LOC (confidence: 0.9939)
+ ```
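The pipeline returns a list of dicts; downstream code often wants mentions grouped by entity type. A small sketch operating on output shaped like the example above (hard-coded here so it runs without downloading the model; the `min_score` threshold is an illustrative addition, not part of the pipeline API):

```python
from collections import defaultdict

# Sample entities shaped like the pipeline output shown above
# (hard-coded so the sketch runs without downloading the model).
entities = [
    {"word": "Apple Inc.", "entity_group": "ORG", "score": 0.9684},
    {"word": "Steve Jobs", "entity_group": "PER", "score": 0.9950},
    {"word": "Cupertino", "entity_group": "LOC", "score": 0.9876},
    {"word": "California", "entity_group": "LOC", "score": 0.9939},
]

def group_by_type(entities, min_score=0.5):
    """Bucket entity mentions by type, dropping low-confidence predictions."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)

print(group_by_type(entities))
# {'ORG': ['Apple Inc.'], 'PER': ['Steve Jobs'], 'LOC': ['Cupertino', 'California']}
```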
+
+ ## Ethical Considerations
+
+ **Privacy:** This model may extract personal information (names, locations, organizations) from text. Users should:
+ - Implement appropriate data protection measures
+ - Comply with relevant privacy regulations (GDPR, CCPA, etc.)
+ - Obtain necessary consent before processing personal data
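One concrete protection measure is redacting recognized entities before text is stored or logged. A hedged sketch using the character offsets the token-classification pipeline reports in each entity's `start`/`end` fields (offsets hard-coded here for illustration):

```python
# Hedged sketch: masking recognized entities before storage or logging.
# The dicts mirror the shape of pipeline output, which carries character
# offsets in `start`/`end`; values here are hard-coded for illustration.

def redact(text, entities):
    """Replace each entity span with a [TYPE] placeholder, working
    right-to-left so earlier character offsets stay valid."""
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[:ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text

entities = [
    {"entity_group": "PER", "start": 0, "end": 10},
    {"entity_group": "LOC", "start": 20, "end": 29},
]
print(redact("Steve Jobs lived in Cupertino.", entities))
# [PER] lived in [LOC].
```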
+
+ **Bias:** The model's performance may reflect biases present in the training data, potentially affecting:
+ - Recognition rates across different demographic groups
+ - Entity detection in various cultural contexts
+ - Performance on minority or underrepresented entities
+
+ Users should validate the model's performance on their specific use cases and implement bias mitigation strategies as needed.
+
+ ## Citation
+
+ If you use this model in your research, please cite the ModernBERT paper:
+
+ ```bibtex
+ @misc{modernbert,
+       title={Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference},
+       author={Benjamin Warner and Antoine Chaffin and Benjamin Clavié and Orion Weller and Oskar Hallström and Said Taghadouini and Alexis Gallagher and Raja Biswas and Faisal Ladhak and Tom Aarsen and Nathan Cooper and Griffin Adams and Jeremy Howard and Iacopo Poli},
+       year={2024},
+       eprint={2412.13663},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2412.13663},
+ }
+ ```
+
+ ## License
+
+ This model is released under the Apache 2.0 License.
+
+ ## Acknowledgments
+
+ This model was built on the ModernBERT-large architecture from Answer.AI and trained using the Hugging Face Transformers library.