Sharpaxis/BERT-NER-CoNLL

+---
+library_name: transformers
+license: apache-2.0
+base_model: bert-base-uncased
+tags:
+- generated_from_trainer
+datasets:
+- conll2003
+metrics:
+- f1
+model-index:
+- name: BERT-NER-CoNLL
+  results:
+  - task:
+      name: Token Classification
+      type: token-classification
+    dataset:
+      name: conll2003
+      type: conll2003
+      config: conll2003
+      split: test
+      args: conll2003
+    metrics:
+    - name: F1
+      type: f1
+      value: 0.903734876380852
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# BERT-NER-CoNLL
+This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the conll2003 dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1199
+- F1: 0.9037
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 42
+- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 3
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | F1     |
+|:-------------:|:-----:|:----:|:---------------:|:------:|
+| 0.1298        | 1.0   | 878  | 0.1098          | 0.8849 |
+| 0.0355        | 2.0   | 1756 | 0.1139          | 0.9012 |
+| 0.0202        | 3.0   | 2634 | 0.1199          | 0.9037 |
+### Framework versions
+- Transformers 4.47.0
+- PyTorch 2.5.1+cu121
+- Datasets 3.2.0
+- Tokenizers 0.21.0
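
The step counts in the training results table can be cross-checked against the listed hyperparameters. The sketch below is only an illustration: it assumes the CoNLL-2003 English training split has 14,041 sentences, that training ran on a single device without gradient accumulation, and that the linear scheduler used zero warmup steps. None of these is stated in the card. `linear_lr` is a hypothetical helper written for this check, not a library function.

```python
import math

# Values taken from the model card.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 3

# Assumptions (not stated in the card): 14,041 training sentences,
# one device, no gradient accumulation, zero warmup steps.
NUM_TRAIN_EXAMPLES = 14_041
WARMUP_STEPS = 0

steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - WARMUP_STEPS)


print(steps_per_epoch, total_steps)  # 878 2634
for step in (0, 878, 1756, 2634):
    print(step, linear_lr(step))
```

Under these assumptions the computed 878 steps per epoch and 2634 total steps agree with the Step column of the table, and the learning rate decays linearly from 2e-05 to 0 over the run.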

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "bert-large-uncased",
+  "_name_or_path": "bert-base-uncased",
   "architectures": [
     "BertForTokenClassification"
   ],
@@ -8,7 +8,7 @@
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
   "hidden_dropout_prob": 0.1,
-  "hidden_size": 1024,
+  "hidden_size": 768,
   "id2label": {
     "0": "O",
     "1": "B-PER",
@@ -21,7 +21,7 @@
     "8": "I-MISC"
   },
   "initializer_range": 0.02,
-  "intermediate_size": 4096,
+  "intermediate_size": 3072,
   "label2id": {
     "B-LOC": 5,
     "B-MISC": 7,
@@ -36,8 +36,8 @@
   "layer_norm_eps": 1e-12,
   "max_position_embeddings": 512,
   "model_type": "bert",
-  "num_attention_heads": 16,
-  "num_hidden_layers": 24,
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
   "pad_token_id": 0,
   "position_embedding_type": "absolute",
   "torch_dtype": "float32",
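
The `id2label` entries visible in this diff (`O`, `B-PER`, `B-LOC`, `B-MISC`, `I-MISC`) follow the BIO tagging scheme, where a `B-` tag opens an entity and matching `I-` tags continue it. As a rough sketch of how word-level tags of this kind could be merged into entity spans, consider the hypothetical helper `group_bio` below. The sentence and its tags are hand-written for illustration and are not output from this model.

```python
from typing import List, Optional, Tuple


def group_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge word-level BIO tags into (entity_type, text) spans."""
    spans: List[Tuple[str, List[str]]] = []
    current_type: Optional[str] = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new span.
            current_type = tag[2:]
            spans.append((current_type, [token]))
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type extends the open span.
            spans[-1][1].append(token)
        else:
            # "O", or an I- tag with no matching open span (dropped in this sketch).
            current_type = None
    return [(etype, " ".join(words)) for etype, words in spans]


# Hand-written example tags, using only labels visible in the diff.
tokens = ["Messi", "won", "the", "World", "Cup", "in", "Qatar"]
tags = ["B-PER", "O", "O", "B-MISC", "I-MISC", "O", "B-LOC"]
entities = group_bio(tokens, tags)
print(entities)  # [('PER', 'Messi'), ('MISC', 'World Cup'), ('LOC', 'Qatar')]
```

Dropping stray `I-` tags is one of several reasonable policies; a stricter evaluator might instead treat them as the start of a new span.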

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4f1aa56528ed4b4845a575d24806836ace270e86df537b76bb0e8677ae5724af
-size 1336452868
+oid sha256:93adbba002de4da818263e110638eae20bfcc998bc12c7f3874dcaddcd71594d
+size 435617620
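
The drop in checkpoint size is consistent with the config.json change from a BERT-large layout to a BERT-base layout stored in float32. The back-of-the-envelope estimate below is a sketch, not an exact accounting. It assumes a vocabulary of 30,522 tokens and 2 token-type embeddings (the usual values for `bert-base-uncased`, not shown in this diff), 9 labels (inferred from `id2label` ending at `"8"`), and no pooler layer in the token-classification model. `bert_token_classifier_params` is a hypothetical helper.

```python
def bert_token_classifier_params(hidden, intermediate, layers,
                                 vocab=30522, max_pos=512, type_vocab=2,
                                 num_labels=9):
    """Estimate the parameter count of a BERT token-classification model."""
    # Word, position, and token-type embeddings plus one LayerNorm.
    embeddings = (vocab + max_pos + type_vocab) * hidden + 2 * hidden
    # Q, K, V, and output projections (weights + biases) plus one LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + 2 * hidden
    # Two feed-forward projections (weights + biases) plus one LayerNorm.
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden)
                    + 2 * hidden)
    # Linear classification head over the label set.
    classifier = hidden * num_labels + num_labels
    return embeddings + layers * (attention + feed_forward) + classifier


BYTES_PER_FLOAT32 = 4

# Architecture values and file sizes taken from the diff.
old_params = bert_token_classifier_params(hidden=1024, intermediate=4096, layers=24)
new_params = bert_token_classifier_params(hidden=768, intermediate=3072, layers=12)

for name, params, file_size in (
    ("old (large layout)", old_params, 1336452868),
    ("new (base layout)", new_params, 435617620),
):
    estimate = params * BYTES_PER_FLOAT32
    print(name, params, estimate, file_size - estimate)
```

Under these assumptions both estimates fall slightly below the recorded LFS sizes, by well under 0.1%. The small remainder is presumably file-format overhead such as the safetensors header.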