---
license: apache-2.0
base_model: bioformers/bioformer-16L
tags:
- generated_from_trainer
metrics:
- f1
- precision
- recall
- accuracy
model-index:
- name: cl_ct_custom_model
  results: []
datasets:
- tner/bionlp2004
language:
- en
pipeline_tag: token-classification
inference: true
library_name: transformers
---


# cl_ct_custom_model

This model is a fine-tuned version of [bioformers/bioformer-16L](https://huggingface.co/bioformers/bioformer-16L) on the [tner/bionlp2004](https://huggingface.co/datasets/tner/bionlp2004) dataset.
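It achieves the following results on the evaluation set (the F1, precision, and recall below match the micro averages under Test Results, not the final validation row under Training results):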

- Loss: 0.2590
- F1: 0.7609
- Precision: 0.7112
- Recall: 0.8181
- Accuracy: 0.9229
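
As a quick consistency check on the figures above, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall; the function name `f1_from_pr` is illustrative and not part of any library.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the summary list above.
reported_precision = 0.7112
reported_recall = 0.8181
reported_f1 = 0.7609

recomputed_f1 = f1_from_pr(reported_precision, reported_recall)
print(f"recomputed F1 = {recomputed_f1:.4f}, reported F1 = {reported_f1:.4f}")
```

The recomputed value agrees with the reported F1 to four decimal places, so the three summary figures are internally consistent.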

## Model description

More information needed

## Intended uses & limitations

More information needed
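
The card metadata declares `pipeline_tag: token-classification` and `library_name: transformers`, so inference would typically go through the `transformers` pipeline API. The following is a minimal sketch under stated assumptions: the card does not give the model's Hub repository id, so `model_id` is left as a caller-supplied argument, and the helper names `load_ner_pipeline` and `group_entities` are hypothetical. The label names the model emits are also not listed here, so the grouping helper simply uses whatever labels come back.

```python
from collections import defaultdict


def load_ner_pipeline(model_id: str):
    """Build a token-classification pipeline for a Hub repo id or local path.

    `model_id` is an assumption supplied by the caller; this card does not
    state the repository id under which the model is published.
    """
    # Imported lazily so the grouping helper below works without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into entity spans and
    # returns dicts with "entity_group", "score", "word", "start", "end".
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def group_entities(predictions):
    """Collect predicted entity strings under their predicted label."""
    grouped = defaultdict(list)
    for pred in predictions:
        grouped[pred["entity_group"]].append(pred["word"])
    return dict(grouped)
```

To use it, call `load_ner_pipeline` with the repository id or a local checkpoint directory, run the returned pipeline on a sentence, and pass the result to `group_entities`.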

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 3407
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
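
The relationship between these values can be sketched in a few lines. The example below derives the total train batch size from the per-device batch size and gradient accumulation, and models a linear learning-rate decay. It assumes a single device (which is what 16 × 4 = 64 implies) and zero warmup steps; warmup is not listed above, so that is an assumption. The total of 2590 optimizer steps is taken from the last row of the training results table.

```python
# Values from the hyperparameter list above.
learning_rate = 2e-05
train_batch_size = 16
gradient_accumulation_steps = 4

# Assumptions not stated in the card.
num_devices = 1     # implied by total_train_batch_size = 64
warmup_steps = 0    # no warmup setting is listed

# Final optimizer step reported in the training results table.
total_steps = 2590

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print("total train batch size:", total_train_batch_size)
for step in (0, 1295, 2590):
    print(f"step {step}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate starts at 2e-05, is halved at the midpoint of training, and reaches zero at the final step.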

### Training results

| Training Loss | Epoch  | Step | Validation Loss | F1     | Precision | Recall | Accuracy |
|:-------------:|:------:|:----:|:---------------:|:------:|:---------:|:------:|:--------:|
| 0.4568        | 0.9971 | 259  | 0.2146          | 0.8139 | 0.7920    | 0.8370 | 0.9326   |
| 0.2115        | 1.9981 | 519  | 0.1907          | 0.8349 | 0.8125    | 0.8586 | 0.9379   |
| 0.1802        | 2.9990 | 779  | 0.1912          | 0.8407 | 0.8178    | 0.8650 | 0.9394   |
| 0.164         | 4.0    | 1039 | 0.1869          | 0.8449 | 0.8255    | 0.8652 | 0.9401   |
| 0.1518        | 4.9971 | 1298 | 0.1819          | 0.8525 | 0.8348    | 0.8710 | 0.9428   |
| 0.1424        | 5.9981 | 1558 | 0.1842          | 0.8506 | 0.8351    | 0.8666 | 0.9422   |
| 0.134         | 6.9990 | 1818 | 0.1869          | 0.8539 | 0.8373    | 0.8712 | 0.9428   |
| 0.128         | 8.0    | 2078 | 0.1889          | 0.8540 | 0.8374    | 0.8712 | 0.9429   |
| 0.1241        | 8.9971 | 2337 | 0.1892          | 0.8559 | 0.8401    | 0.8724 | 0.9432   |
| 0.1199        | 9.9711 | 2590 | 0.1899          | 0.8552 | 0.8392    | 0.8718 | 0.9431   |
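
The table can be scanned programmatically to see where validation loss bottoms out and where F1 peaks. The sketch below copies the epoch, step, validation loss, and F1 columns from the table; it only illustrates how one might pick a checkpoint, since the card does not say which checkpoint was kept.

```python
# (epoch, step, validation_loss, f1) copied from the training results table.
rows = [
    (0.9971, 259, 0.2146, 0.8139),
    (1.9981, 519, 0.1907, 0.8349),
    (2.9990, 779, 0.1912, 0.8407),
    (4.0, 1039, 0.1869, 0.8449),
    (4.9971, 1298, 0.1819, 0.8525),
    (5.9981, 1558, 0.1842, 0.8506),
    (6.9990, 1818, 0.1869, 0.8539),
    (8.0, 2078, 0.1889, 0.8540),
    (8.9971, 2337, 0.1892, 0.8559),
    (9.9711, 2590, 0.1899, 0.8552),
]

# Lowest validation loss and highest validation F1 need not coincide.
best_by_loss = min(rows, key=lambda r: r[2])
best_by_f1 = max(rows, key=lambda r: r[3])

print(f"lowest validation loss {best_by_loss[2]} at step {best_by_loss[1]}")
print(f"highest F1 {best_by_f1[3]} at step {best_by_f1[1]}")
```

Validation loss is lowest around epoch 5 while F1 keeps improving slightly until around epoch 9, so the choice of selection metric changes which checkpoint looks best.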

## Evaluation classification report

| Class       | Precision | Recall | F1-Score | Support |
|-------------|------------|--------|----------|---------|
| DNA         | 0.78       | 0.84   | 0.81     | 2494    |
| RNA         | 0.83       | 0.89   | 0.86     | 238     |
| Cell Line   | 0.81       | 0.85   | 0.83     | 1050    |
| Cell Type   | 0.74       | 0.79   | 0.77     | 775     |
| Protein     | 0.88       | 0.90   | 0.89     | 6196    |
| **Micro Avg** | **0.84**   | **0.87** | **0.86** | **10753** |
| **Macro Avg** | **0.81**   | **0.86** | **0.83** | **10753** |
| **Weighted Avg** | **0.84**   | **0.87** | **0.86** | **10753** |


## Test Results

| Class       | Precision | Recall | F1-Score | Support |
|-------------|-----------|--------|----------|---------|
| DNA         | 0.74      | 0.79   | 0.76     | 2210    |
| RNA         | 0.73      | 0.76   | 0.75     | 287     |
| Cell Line   | 0.50      | 0.76   | 0.61     | 1057    |
| Cell Type   | 0.75      | 0.68   | 0.71     | 2761    |
| Protein     | 0.72      | 0.87   | 0.79     | 10082   |
| **Micro Avg** | **0.71**  | **0.82** | **0.76** | **16397** |
| **Macro Avg** | **0.69**  | **0.77** | **0.72** | **16397** |
| **Weighted Avg** | **0.72**  | **0.82** | **0.76** | **16397** |
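
The macro and weighted averages in this table can be approximately reproduced from the per-class rows. The sketch below does so using the rounded two-decimal values shown above, so the results can differ from the table by about 0.01. Micro averages cannot be recovered this way because they require the raw true-positive, false-positive, and false-negative counts, which the card does not list.

```python
# Per-class (precision, recall, f1, support) copied from the test results table.
report = {
    "DNA": (0.74, 0.79, 0.76, 2210),
    "RNA": (0.73, 0.76, 0.75, 287),
    "Cell Line": (0.50, 0.76, 0.61, 1057),
    "Cell Type": (0.75, 0.68, 0.71, 2761),
    "Protein": (0.72, 0.87, 0.79, 10082),
}

total_support = sum(row[3] for row in report.values())


def macro_avg(column: int) -> float:
    """Unweighted mean of one metric column across classes."""
    return sum(row[column] for row in report.values()) / len(report)


def weighted_avg(column: int) -> float:
    """Mean of one metric column weighted by each class's support."""
    return sum(row[column] * row[3] for row in report.values()) / total_support


for name, column in (("precision", 0), ("recall", 1), ("f1", 2)):
    print(f"{name}: macro = {macro_avg(column):.3f}, weighted = {weighted_avg(column):.3f}")
print("total support:", total_support)
```

Because Protein accounts for well over half of the support, the weighted averages track the Protein row closely, while the macro averages are pulled down by the low Cell Line precision.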


## Framework versions

- Transformers 4.43.4
- Pytorch 2.4.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1