SciBERT-based PEFT adapters
Description
This repository provides SciBERT-based parameter-efficient fine-tuning (PEFT) adapters using LoRA and DoRA for climate change–related NLP tasks. The models adapt allenai/scibert_scivocab_uncased to climate-science text via masked language modeling on a climate corpus of 400k scientific sentences, partitioned into nested subsets of different sizes (20k–400k).
The models are intended as drop-in SciBERT adapters for climate-focused downstream tasks such as document classification (e.g., SciDCC) and claim verification (e.g., Climate-FEVER), where users want domain adaptation with substantially fewer trainable parameters than full fine-tuning.
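As a rough illustration of the parameter savings, the following sketch counts the LoRA-trainable parameters for the configuration listed below (rank 8, query and value projections). The architecture figures (12 encoder layers, hidden size 768) are the standard BERT-base dimensions that SciBERT shares; the ~110M total is an approximation.

```python
# Back-of-the-envelope count of trainable parameters for a LoRA adapter
# applied to SciBERT's query and value attention projections.

num_layers = 12        # BERT-base encoder layers (SciBERT uses this architecture)
hidden = 768           # each attention projection is a hidden x hidden matrix
r = 8                  # LoRA rank used by these adapters
targets_per_layer = 2  # query and value projections

# Each adapted projection adds two low-rank factors:
# A (r x hidden) and B (hidden x r).
params_per_module = r * hidden + hidden * r
trainable = num_layers * targets_per_layer * params_per_module

print(trainable)                          # LoRA-trainable parameters
print(f"{trainable / 110_000_000:.2%}")   # fraction of SciBERT's ~110M parameters
```

This works out to under 0.3% of the base model's weights being trained, which is the "substantially fewer trainable parameters" trade-off mentioned above.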
Parameters
Each checkpoint corresponds to a specific PEFT method (LoRA or DoRA) and a specific corpus size, enabling users to choose a trade-off between adaptation strength, stability, and the number of trainable parameters. All adapters keep SciBERT frozen and only train lightweight modules:
LoRA rank (r): 8
Scaling factor (alpha): 32
Dropout: 0.25
Target modules: query and value attention projections
MLM pretraining: 3 epochs, max sequence length 350, 20% masking probability
Random seed: 13
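The hyperparameters above map directly onto a PEFT `LoraConfig`. A minimal sketch (the exact config used for training is not published with the card, so treat this as an illustration of the listed settings; DoRA checkpoints would additionally set `use_dora=True`):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                                # LoRA rank
    lora_alpha=32,                      # scaling factor
    lora_dropout=0.25,
    target_modules=["query", "value"],  # SciBERT attention projections
    # use_dora=True,                    # uncomment for the DoRA variant
)
```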
Best models
In the author's own testing, the best-performing checkpoints are LoRA 150k and DoRA 250k.
Model tree for MarkoRibaric/MarrLoraDora
Base model
allenai/scibert_scivocab_uncased