bert-base-cased-sci-units-ner

This model is a fine-tuned version of bert-base-cased on the PQA part of the bowenxian/BioProBench dataset. It achieves the following results on the evaluation set:

Loss: 0.0175
Precision: 0.9873
Recall: 0.9867
F1: 0.9870
Accuracy: 0.9962

Model description

The model is bert-base-cased fine-tuned for token classification. The tokens it classifies are the values and units of scientific measurements.

For example, in the sentence:

"Place the seeds in a refrigerator at 4°C along with a small amount of water for 2-3 days."

The model selects "4°C" and identifies the value as 4 and the unit as °C.

"Centrifuge at 863g for 5 min at room temperature (18–28°C), decant supernatant and resuspend cells in culture medium."

Here the model identifies two value-unit combinations:

VALUE: 863, UNIT: g
VALUE: 18 - 28, UNIT: °C
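
A minimal sketch of how such value-unit combinations might be assembled from the model's output with the Transformers `pipeline` API. The repository id is taken from this card's model tree. The label names `VALUE` and `UNIT` are an assumption based on the wording above, so check the model's `id2label` mapping before relying on them. `load_tagger` and `pair_values_with_units` are hypothetical helper names, and the sample spans are hand-written to mirror the example above; they are not real model output (real pipeline results also carry `score`, `start` and `end` keys).

```python
from typing import Dict, List, Tuple

MODEL_ID = "m1969m/bert-base-cased-sci-units-ner"


def load_tagger():
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the pairing helper below can be used without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline("token-classification", model=MODEL_ID, aggregation_strategy="simple")


def pair_values_with_units(entities: List[Dict]) -> List[Tuple[str, str]]:
    """Pair each VALUE span with the next UNIT span, in reading order."""
    pairs = []
    pending_value = None
    for entity in entities:
        label = entity["entity_group"]  # assumed label names: "VALUE" and "UNIT"
        if label == "VALUE":
            pending_value = entity["word"]
        elif label == "UNIT" and pending_value is not None:
            pairs.append((pending_value, entity["word"]))
            pending_value = None
    return pairs


# Hand-written spans in the aggregated pipeline output shape (illustration only).
sample_entities = [
    {"entity_group": "VALUE", "word": "863"},
    {"entity_group": "UNIT", "word": "g"},
    {"entity_group": "VALUE", "word": "18 - 28"},
    {"entity_group": "UNIT", "word": "°C"},
]
print(pair_values_with_units(sample_entities))
```

With the model downloaded, `pair_values_with_units(load_tagger()(sentence))` would apply the same pairing to real predictions. A VALUE with no following UNIT is dropped, which is a design choice of this sketch and not behaviour of the model.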

Intended uses & limitations

The model identifies VALUES and scientific UNITS in a sentence.

This is a work in progress and currently identifies only the following units:

Temperature: °C
Mass (grams): g, ug, mg
Volume (litres): L, uL, mL
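
For downstream use, the unit list above could be encoded as a lookup table. The following is a sketch under stated assumptions: `SUPPORTED_UNITS` and `to_base_unit` are hypothetical names, and the conversion factors are the standard SI prefixes (milli = 1e-3, micro = 1e-6), which are not part of the model itself.

```python
from typing import Optional, Tuple

# Units listed on this card, mapped to (quantity, factor to the base unit).
SUPPORTED_UNITS = {
    "°C": ("temperature", 1.0),
    "g": ("mass", 1.0),
    "mg": ("mass", 1e-3),
    "ug": ("mass", 1e-6),
    "L": ("volume", 1.0),
    "mL": ("volume", 1e-3),
    "uL": ("volume", 1e-6),
}


def to_base_unit(value: float, unit: str) -> Optional[Tuple[str, float]]:
    """Return (quantity, value in g, L or °C), or None for an unsupported unit."""
    if unit not in SUPPORTED_UNITS:
        return None
    quantity, factor = SUPPORTED_UNITS[unit]
    return quantity, value * factor


print(to_base_unit(250, "mg"))
print(to_base_unit(5, "min"))  # "min" is not in the list above, so this is None
```

A tagged "g" is ambiguous on its own: in a phrase such as "Centrifuge at 863g" it denotes relative centrifugal force, not grams, so a lookup like this one would mislabel it without extra context.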

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 3
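
The hyperparameters above might map onto `transformers.TrainingArguments` roughly as follows. This is a sketch, not the author's training script: `output_dir` is a placeholder, mapping `train_batch_size` to `per_device_train_batch_size` assumes a single device with no gradient accumulation, and the argument names should be checked against the installed Transformers version. `TRAINING_KWARGS` and `build_training_args` are hypothetical names.

```python
# Hyperparameters from the list above, keyed by TrainingArguments argument names.
TRAINING_KWARGS = {
    "output_dir": "bert-base-cased-sci-units-ner",  # placeholder, not from the card
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
}


def build_training_args():
    """Create TrainingArguments from the values above (requires transformers and torch)."""
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)
```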

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
0.0684	1.0	682	0.0268	0.9814	0.9765	0.9790	0.9937
0.0194	2.0	1364	0.0195	0.9870	0.9837	0.9853	0.9954
0.0067	3.0	2046	0.0175	0.9873	0.9867	0.9870	0.9962
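
As a consistency check, F1 is the harmonic mean of precision and recall, so the F1 column can be recomputed from the other two. The short sketch below does this for the three rows above. `f1_score` is a local helper written for this example, not a library function, and small differences in the last digit are expected because the reported precision and recall are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (epoch, precision, recall, reported F1) copied from the table above.
rows = [
    (1, 0.9814, 0.9765, 0.9790),
    (2, 0.9870, 0.9837, 0.9853),
    (3, 0.9873, 0.9867, 0.9870),
]

for epoch, precision, recall, reported_f1 in rows:
    recomputed = f1_score(precision, recall)
    print(f"epoch {epoch}: recomputed F1 = {recomputed:.4f}, reported F1 = {reported_f1:.4f}")
```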

Framework versions

Transformers 5.0.0
Pytorch 2.10.0+cu128
Datasets 4.0.0
Tokenizers 0.22.2

Safetensors

Model size: 0.1B params
Tensor type: F32

Model tree for m1969m/bert-base-cased-sci-units-ner

Base model: google-bert/bert-base-cased
Finetuned: this model (m1969m/bert-base-cased-sci-units-ner)