How to use raynardj/roberta-pubmed with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("fill-mask", model="raynardj/roberta-pubmed")
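The pipeline above can then be queried directly. A minimal sketch, assuming network access to the Hugging Face Hub; the example sentence is an illustration of my own, not from the model card. RoBERTa-style tokenizers use `<mask>` as the mask placeholder:

```python
from transformers import pipeline

# Load the fill-mask pipeline (same call as above)
pipe = pipeline("fill-mask", model="raynardj/roberta-pubmed")

# Illustrative query: the pipeline returns a ranked list of candidate
# fills, each with a token string and a probability score
for pred in pipe("The patient was diagnosed with lung <mask>."):
    print(pred["token_str"], round(pred["score"], 3))
```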
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("raynardj/roberta-pubmed")
model = AutoModelForMaskedLM.from_pretrained("raynardj/roberta-pubmed")

We limit the training textual data to the following MeSH categories:
- Biomarkers, Tumor (D014408), including entries such as Carcinoembryonic Antigen (D002272)
- Carcinoma (D002277), including around 80 kinds of carcinoma, such as Carcinoma, Lewis Lung (D018827)
- Clinical Trial (D016439)

The model was trained with mlm_probability=0.15 on 2 Tesla V100 32G GPUs:

training_args = TrainingArguments(
    output_dir=config.save,  # output directory for checkpoints
overwrite_output_dir=True,
num_train_epochs=3,
per_device_train_batch_size=30,
per_device_eval_batch_size=60,
    evaluation_strategy='steps',
    save_total_limit=2,
    eval_steps=250,
    metric_for_best_model='eval_loss',
    greater_is_better=False,
    load_best_model_at_end=True,
    prediction_loss_only=True,
    report_to="none")
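The arguments above can be wired into a masked-language-modeling Trainer roughly as follows. This is a sketch, not the author's exact script: the dataset variables (train_ds, eval_ds) are placeholders for tokenized PubMed abstract datasets, which the model card does not include.

```python
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
)

tokenizer = AutoTokenizer.from_pretrained("raynardj/roberta-pubmed")
model = AutoModelForMaskedLM.from_pretrained("raynardj/roberta-pubmed")

# Randomly masks 15% of input tokens per batch, matching the
# mlm_probability=0.15 noted above
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=training_args,      # the TrainingArguments defined above
    data_collator=collator,
    train_dataset=train_ds,  # placeholder: tokenized training split
    eval_dataset=eval_ds,    # placeholder: tokenized evaluation split
)
trainer.train()
```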