MatteoFasulo committed
Commit 017f499 · verified · 1 Parent(s): 040a3d6

Update README.md

Files changed (1): README.md +128 -15

README.md CHANGED
@@ -4,6 +4,9 @@ license: apache-2.0
  base_model: answerdotai/ModernBERT-large
  tags:
  - generated_from_trainer
  metrics:
  - precision
  - recall
@@ -11,33 +14,76 @@ metrics:
  - accuracy
  model-index:
  - name: ModernBERT-large-NER
- results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
  # ModernBERT-large-NER

- This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0508
- - Precision: 0.9230
- - Recall: 0.9399
- - F1: 0.9314
- - Accuracy: 0.9861

  ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure
@@ -69,3 +115,70 @@ The following hyperparameters were used during training:
  - Pytorch 2.7.0a0+ecf3bae40a.nv25.02
  - Datasets 4.5.0
  - Tokenizers 0.22.2

  base_model: answerdotai/ModernBERT-large
  tags:
  - generated_from_trainer
+ - named-entity-recognition
+ - token-classification
+ - modernbert
  metrics:
  - precision
  - recall

  - accuracy
  model-index:
  - name: ModernBERT-large-NER
+   results:
+   - task:
+       type: token-classification
+     dataset:
+       name: conll2003
+       type: conll2003
+     metrics:
+     - name: Precision
+       type: precision
+       value: 0.9230
+     - name: Recall
+       type: recall
+       value: 0.9399
+     - name: F1
+       type: f1
+       value: 0.9314
+     - name: Accuracy
+       type: accuracy
+       value: 0.9861
+ datasets:
+ - lhoestq/conll2003
+ language:
+ - en
+ pipeline_tag: token-classification
  ---

  # ModernBERT-large-NER

+ This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) for Named Entity Recognition (NER) on the [conll2003](https://huggingface.co/datasets/lhoestq/conll2003) dataset.

  ## Model description

+ ModernBERT-large-NER is a token classification model trained to identify and categorize named entities in text. Built on the ModernBERT-large architecture, this model leverages modern transformer optimizations for efficient and accurate entity extraction.
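As a token classification model, it emits one label per token, using the BIO scheme of conll2003 (B-/I- prefixes over PER, ORG, LOC, MISC). A minimal, self-contained sketch of how such per-token tags are grouped into entity spans (the tags below are illustrative, not actual model output):

```python
# Minimal sketch: grouping per-token BIO labels into typed entity spans.
# Tags here are illustrative; the model's label set follows the conll2003
# scheme (B-/I- prefixes over PER, ORG, LOC, MISC).

def bio_to_spans(tokens, tags):
    """Return (entity_type, text) spans from parallel token/tag lists."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):  # a new entity begins
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)  # continue the open entity
        else:  # "O" (or an inconsistent I- tag) closes any open entity
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(words)) for label, words in spans]

tokens = ["Steve", "Jobs", "founded", "Apple", "in", "Cupertino"]
tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
# [('PER', 'Steve Jobs'), ('ORG', 'Apple'), ('LOC', 'Cupertino')]
```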
+
+ ## Intended Uses
+
+ **Primary Use Cases:**
+ - Named Entity Recognition in text documents
+ - Information extraction pipelines
+
+ **Intended Users:**
+ - NLP researchers and practitioners
+ - Data scientists working with text data
+ - Developers building information extraction systems
+
+ ## Limitations
+
+ **Known Limitations:**
+ - Performance may vary on domains significantly different from the training data
+ - Entity boundaries might be imperfect for complex or nested entities
+ - May require domain-specific fine-tuning for specialized applications (medical, legal, etc.)
+ - Performance on low-resource languages or code-switched text has not been evaluated
+
+ **Out-of-Scope Uses:**
+ - Real-time processing of sensitive personal information without proper privacy safeguards
+ - High-stakes decision making without human oversight
+ - Applications requiring 100% accuracy in entity detection
  ## Training and evaluation data

+ The model was fine-tuned on the [conll2003](https://huggingface.co/datasets/lhoestq/conll2003) dataset, a standard English NER benchmark annotated with four entity types: persons (PER), organizations (ORG), locations (LOC), and miscellaneous entities (MISC).
+
+ ## Performance
+
+ It achieves the following results on the evaluation set:
+ - Loss: 0.0508
+ - Precision: 0.9230
+ - Recall: 0.9399
+ - F1: 0.9314
+ - Accuracy: 0.9861
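The precision, recall, and F1 above are entity-level, in the style of seqeval: a predicted entity counts as correct only when both its type and its exact span match the gold annotation. A hedged pure-Python sketch of the computation (simplified relative to seqeval's full tag-scheme handling):

```python
# Illustrative sketch of entity-level scoring in the style of seqeval:
# an entity is a true positive only when its type AND exact span match.

def extract_entities(tags):
    """Collect (type, start, end) spans from a BIO tag sequence."""
    entities, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last span
        inside = tag.startswith("I-") and tag[2:] == etype
        if not inside:  # the current span (if any) ends here
            if etype is not None:
                entities.add((etype, start, i))
            if tag[:2] in ("B-", "I-"):
                start, etype = i, tag[2:]  # a new span begins
            else:
                start, etype = None, None
    return entities

def f1_scores(true_tags, pred_tags):
    """Entity-level precision, recall, and F1."""
    gold, pred = extract_entities(true_tags), extract_entities(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "B-ORG"]  # wrong type on the second entity
print(f1_scores(gold, pred))  # (0.5, 0.5, 0.5)
```

Note that the token-level accuracy (0.9861) is computed over individual tags, which is why it is much higher than the entity-level scores.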

  ## Training procedure

  - Pytorch 2.7.0a0+ecf3bae40a.nv25.02
  - Datasets 4.5.0
  - Tokenizers 0.22.2
+
+ ## How to Use
+
+ ```python
+ import torch
+ from transformers import pipeline
+
+ # Create an NER pipeline
+ ner_pipeline = pipeline(
+     "token-classification",
+     model="MatteoFasulo/ModernBERT-large-NER",
+     aggregation_strategy="simple",
+     dtype=torch.bfloat16,
+ )
+
+ # Example usage
+ text = "Apple Inc. was founded by Steve Jobs in Cupertino, California."
+ entities = ner_pipeline(text)
+
+ for entity in entities:
+     print(
+         f"{entity['word']}: {entity['entity_group']} (confidence: {entity['score']:.4f})"
+     )
+
+ # Apple Inc.: ORG (confidence: 0.9684)
+ # Steve Jobs: PER (confidence: 0.9950)
+ # Cupertino: LOC (confidence: 0.9876)
+ # California: LOC (confidence: 0.9939)
+ ```
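The pipeline returns a list of dicts; downstream code often wants mentions grouped by entity type. A small sketch operating on output shaped like the example above (hard-coded here so it runs without downloading the model; the `min_score` threshold is an illustrative addition, not part of the pipeline API):

```python
from collections import defaultdict

# Sample entities shaped like the pipeline output shown above
# (hard-coded so the sketch runs without downloading the model).
entities = [
    {"word": "Apple Inc.", "entity_group": "ORG", "score": 0.9684},
    {"word": "Steve Jobs", "entity_group": "PER", "score": 0.9950},
    {"word": "Cupertino", "entity_group": "LOC", "score": 0.9876},
    {"word": "California", "entity_group": "LOC", "score": 0.9939},
]

def group_by_type(entities, min_score=0.5):
    """Bucket entity mentions by type, dropping low-confidence predictions."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)

print(group_by_type(entities))
# {'ORG': ['Apple Inc.'], 'PER': ['Steve Jobs'], 'LOC': ['Cupertino', 'California']}
```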
+
+ ## Ethical Considerations
+
+ **Privacy:** This model may extract personal information (names, locations, organizations) from text. Users should:
+ - Implement appropriate data protection measures
+ - Comply with relevant privacy regulations (GDPR, CCPA, etc.)
+ - Obtain necessary consent before processing personal data
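One concrete protection measure is redacting recognized entities before text is stored or logged. A hedged sketch using the character offsets the token-classification pipeline reports in each entity's `start`/`end` fields (offsets hard-coded here for illustration):

```python
# Hedged sketch: masking recognized entities before storage or logging.
# The dicts mirror the shape of pipeline output, which carries character
# offsets in `start`/`end`; values here are hard-coded for illustration.

def redact(text, entities):
    """Replace each entity span with a [TYPE] placeholder, working
    right-to-left so earlier character offsets stay valid."""
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[:ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text

entities = [
    {"entity_group": "PER", "start": 0, "end": 10},
    {"entity_group": "LOC", "start": 20, "end": 29},
]
print(redact("Steve Jobs lived in Cupertino.", entities))
# [PER] lived in [LOC].
```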
+
+ **Bias:** The model's performance may reflect biases present in the training data, potentially affecting:
+ - Recognition rates across different demographic groups
+ - Entity detection in various cultural contexts
+ - Performance on minority or underrepresented entities
+
+ Users should validate the model's performance on their specific use cases and implement bias mitigation strategies as needed.
+
+ ## Citation
+
+ If you use this model in your research, please cite the ModernBERT paper:
+
+ ```bibtex
+ @misc{modernbert,
+       title={Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference},
+       author={Benjamin Warner and Antoine Chaffin and Benjamin Clavié and Orion Weller and Oskar Hallström and Said Taghadouini and Alexis Gallagher and Raja Biswas and Faisal Ladhak and Tom Aarsen and Nathan Cooper and Griffin Adams and Jeremy Howard and Iacopo Poli},
+       year={2024},
+       eprint={2412.13663},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2412.13663},
+ }
+ ```
+
+ ## License
+
+ This model is released under the Apache 2.0 License.
+
+ ## Acknowledgments
+
+ This model was built on the ModernBERT-large architecture from Answer.AI and trained using the Hugging Face Transformers library.