metadata
license: mit
tags:
- generated_from_trainer
datasets:
- renet
metrics:
- precision
- recall
- f1
- accuracy
model_index:
- name: BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext-finetuned-renet
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: renet
      type: renet
    metric:
      name: Accuracy
      type: accuracy
      value: 0.8640646029609691
BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext-finetuned-renet
A model for detecting gene-disease associations from abstracts. It outputs 0 for no association and 1 for some association.
This model is a fine-tuned version of microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext on the RENET2 dataset. Note that this considers only the abstract data, and not the full text information, from RENET2.
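The label convention above can be illustrated with a minimal inference sketch. It assumes the checkpoint loads through the standard `transformers` sequence-classification classes; `model_path` is a hypothetical placeholder for wherever the checkpoint lives, and passing the raw abstract as the only input is also an assumption, since this card does not describe how the gene-disease pair is encoded.

```python
import math

# Label convention stated in this card.
LABELS = {0: "no association", 1: "some association"}


def label_from_logits(logits):
    """Turn two-class logits into (predicted label id, its softmax probability)."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    pred = max(range(len(probs)), key=probs.__getitem__)
    return pred, probs[pred]


def classify_abstract(text, model_path):
    """Classify one abstract. `model_path` is a hypothetical checkpoint location."""
    # Imported lazily so the pure-Python helper above works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    model.eval()

    # Assumption: the abstract text is passed as-is and truncated to 512 tokens.
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_from_logits(logits)
```

A call such as `classify_abstract(abstract_text, model_path)` would return the label id and its probability; `LABELS[pred]` maps the id back to the description used in this card.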
It achieves the following results on the evaluation set:
- Loss: 0.7226
- Precision: 0.7799
- Recall: 0.8211
- F1: 0.8
- Accuracy: 0.8641
- AUC: 0.9325
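As a quick consistency check on the numbers above, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported values; the function name is illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported on the evaluation set in this card.
precision, recall = 0.7799, 0.8211
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.4f}")  # F1 = 0.8000
```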
Training procedure
The abstract dataset from RENET2 was split into 85% training and 15% evaluation data, grouped by PMID and stratified by label. That is, no PMID appears in both the training set and the evaluation set.
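A grouped, stratified split of this kind might be sketched as follows. This is not the code used to produce this model: the function name, the rule for assigning a stratum to each PMID (positive if any of its rows is positive), and the per-stratum sampling are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def grouped_stratified_split(pmids, labels, eval_fraction=0.15, seed=1):
    """Return (train_idx, eval_idx) so that no PMID appears in both sets."""
    # Collect the row indices that belong to each PMID.
    rows_by_pmid = defaultdict(list)
    for i, pmid in enumerate(pmids):
        rows_by_pmid[pmid].append(i)

    # Assumption: a PMID falls in stratum 1 if any of its rows has label 1.
    pmids_by_stratum = defaultdict(list)
    for pmid, rows in rows_by_pmid.items():
        stratum = int(any(labels[i] == 1 for i in rows))
        pmids_by_stratum[stratum].append(pmid)

    # Sample the evaluation PMIDs separately within each stratum.
    rng = random.Random(seed)
    eval_pmids = set()
    for stratum in sorted(pmids_by_stratum):
        group = sorted(pmids_by_stratum[stratum])  # fixed order before shuffling
        rng.shuffle(group)
        n_eval = round(len(group) * eval_fraction)
        eval_pmids.update(group[:n_eval])

    train_idx = [i for i, p in enumerate(pmids) if p not in eval_pmids]
    eval_idx = [i for i, p in enumerate(pmids) if p in eval_pmids]
    return train_idx, eval_idx
```

Because whole PMIDs are assigned to one side, the row-level proportions only approximate 85/15 when PMIDs contribute different numbers of rows.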
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 1
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
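The hyperparameters above map onto keyword arguments of `transformers.TrainingArguments` roughly as sketched below. Two assumptions are made that this card does not confirm: training ran on a single device, so the batch sizes are per-device values, and no warmup steps were used, so the linear schedule decays from the initial learning rate straight to zero. The names `TRAINING_KWARGS` and `linear_lr` are illustrative.

```python
# Keyword arguments for transformers.TrainingArguments
# (pass an output_dir of your choice alongside them).
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 16,  # assumption: single device
    "per_device_eval_batch_size": 16,   # assumption: single device
    "seed": 1,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5,
}


def linear_lr(step, total_steps, base_lr=TRAINING_KWARGS["learning_rate"]):
    """Learning rate at `step` under linear decay to zero (assumes no warmup)."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)


# Example: halfway through training the rate has halved.
print(linear_lr(50, 100))
```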
Framework versions
- Transformers 4.9.0.dev0
- PyTorch 1.10.0.dev20210630+cu113
- Datasets 1.8.0
- Tokenizers 0.10.3