| --- |
| library_name: transformers |
| license: apache-2.0 |
| language: |
| - en |
| base_model: |
| - allenai/scibert_scivocab_uncased |
| pipeline_tag: text-classification |
| --- |
| |
| # Model Card for Cbelem/scibert-certainty-classif |
|
|
|
|
| This is a text classification model, fine-tuned to predict certainty ratings of scientific findings using both a classification loss and a ranking loss. |
| We fine-tuned `allenai/scibert_scivocab_uncased` on the dataset released by [Wührl et al. (2024), *Understanding Fine-grained Distortions in Reports of Scientific Findings*](https://aclanthology.org/2024.findings-acl.369/). |
|
|
|
|
| ## Model Details |
|
|
| ### Model Description |
|
|
|
|
|
|
| - **Developed by:** Researchers at UCI with the goal of obtaining a reliable certainty scoring function. |
| - **Model type:** BERT (SciBERT) encoder with a sequence classification head |
| - **Language(s) (NLP):** English |
| - **Finetuned from model:** allenai/scibert_scivocab_uncased |
|
|
| ## Uses |
|
|
| The model is intended for estimating certainty scores. Because it was trained on sentence-level academic findings, we expect its reliability to be limited to this domain. |
| The original dataset has only moderate inter-annotator agreement (Spearman correlation coefficient of 0.44), which suggests that assigning certainty scores is difficult even for humans. |
| We recommend that users validate the model's behavior on a small portion of their data of interest before scaling up evaluations. |
| The per-class F1 scores fell between 0.48 and 0.70, which again reflects the difficulty of learning clear class boundaries. |
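One way to run the spot check recommended above is to label a small sample by hand and compare it with the model's predictions. The sketch below is only an illustration: the sample labels are hypothetical, and Spearman correlation is used simply because it is the agreement measure quoted for the original annotations.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks.

    Assumes each input contains at least two distinct values.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical spot-check sample: human labels vs. model predictions (classes 0-2).
human_labels = [0, 1, 2, 2, 1, 0, 2, 1]
model_preds = [0, 1, 2, 1, 1, 0, 2, 2]

rho = spearman(human_labels, model_preds)
accuracy = sum(h == p for h, p in zip(human_labels, model_preds)) / len(human_labels)
print(f"spearman={rho:.2f} accuracy={accuracy:.2f}")
```

If agreement on such a sample falls well below the human-human agreement of 0.44, that is a sign the data of interest may lie outside the domain the model was trained on.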
|
|
|
|
| ## How to Get Started with the Model |
|
|
| Use the code below to get started with the model. |
|
|
| ```python |
| import torch |
| from transformers import AutoTokenizer, AutoModelForSequenceClassification |
| |
| tokenizer = AutoTokenizer.from_pretrained("Cbelem/scibert-certainty-classif") |
| model = AutoModelForSequenceClassification.from_pretrained("Cbelem/scibert-certainty-classif") |
| model.eval()  # disable dropout for inference |
| |
| texts = [ |
| "Compared with controls, taxi drivers had greater grey matter volume in the posterior hippocampi (Maguire et al.", |
| "The study described in this paper focuses on gaze, but similar approaches can be used to understand the effects of other interactions that contribute to patient outcomes such as emotion.", |
| '"The initial findings could have been explained by a correlation, that people with big hippocampi become taxi drivers," he says.', |
| "We are less sure about a possible explanation for lower acceptance for mobile phone behaviors among professionals in the West.", |
| ] |
| |
| # Pad to the longest text (and truncate overly long ones) so the batch forms one tensor. |
| inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt") |
| |
| # No gradients are needed at inference time. |
| with torch.no_grad(): |
|     logits = model(**inputs).logits |
| |
| # Index of the highest-scoring class for each text. |
| predicted_classes = logits.argmax(dim=-1) |
| ``` |
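The forward pass above returns an output object whose `.logits` field holds one row of scores per input text. The sketch below shows one way to turn such a row into class probabilities and a predicted class. The logit values are hypothetical, and the three-class setup is inferred from the per-class metrics in the Results section. This card does not document which certainty level each class index stands for, so check `model.config.id2label` before interpreting predictions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(logits):
    """Return (predicted class index, class probabilities) for one input."""
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted, probs


# Hypothetical logits for one sentence; real rows can be obtained by calling
# `.tolist()` on the model output's `.logits` tensor.
example_logits = [-0.3, 0.1, 1.2]
predicted, probs = summarize(example_logits)
print(predicted, [round(p, 3) for p in probs])
```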
|
|
| ## Training Details |
|
|
| ### Training Data |
|
|
| TBD |
|
|
| ### Training Procedure |
|
|
| TBD |
|
|
| #### Preprocessing |
|
|
| TBD |
|
|
|
|
| #### Training Hyperparameters |
|
|
| - **Training regime:** fp32 |
|
|
| ## Evaluation |
|
|
|
|
| ### Testing Data, Factors & Metrics |
|
|
| #### Testing Data |
|
|
|
|
| [More Information Needed] |
|
|
| #### Factors |
|
|
|
|
| [More Information Needed] |
|
|
| #### Metrics |
|
|
| TBD |
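The Results section reports an `eval/qwk` value. Assuming this stands for quadratic weighted kappa, a common agreement measure for ordinal labels, the sketch below shows how that metric is usually defined. It illustrates the standard formula and is not necessarily the evaluation code used for this model.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Cohen's kappa with quadratic disagreement weights for ordinal labels.

    Assumes labels are integers in [0, num_classes) and that the labels are
    not all identical (otherwise the expected disagreement is zero).
    """
    n = len(y_true)
    # Observed confusion matrix: rows are true classes, columns are predictions.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_counts = [sum(row) for row in observed]
    pred_counts = [
        sum(observed[i][j] for i in range(num_classes)) for j in range(num_classes)
    ]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Penalty grows with the squared distance between the two classes.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if true and predicted labels were independent.
            weighted_expected += weight * true_counts[i] * pred_counts[j] / n
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical labels: perfect agreement gives 1.0, fully reversed order gives -1.0.
print(quadratic_weighted_kappa([0, 1, 2], [0, 1, 2], num_classes=3))
print(quadratic_weighted_kappa([0, 1, 2], [2, 1, 0], num_classes=3))
```

Unlike plain accuracy, this measure penalizes confusing class 0 with class 2 more heavily than confusing adjacent classes, which suits ordered certainty levels.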
|
|
| ### Results |
|
|
| ``` |
| "train/learning_rate": 6.869747470432602e-7, |
| "train/loss": 0.562, |
| "train/global_step": 3000, |
| "eval/qwk": 0.5507, |
| "eval/loss": 0.9391, |
| "eval/accuracy": 0.6078, |
| "eval/balanced_accuracy": 0.3980, |
| "eval/f1_macro": 0.6006, |
| "eval/f1_class_0": 0.6211, |
| "eval/f1_class_1": 0.4932, |
| "eval/f1_class_2": 0.6875, |
| "eval/precision_macro": 0.6033, |
| "eval/precision_class_0": 0.6410, |
| "eval/precision_class_1": 0.5, |
| "eval/precision_class_2": 0.6689, |
| "eval/recall_macro": 0.5987, |
| "eval/recall_class_0": 0.6024, |
| "eval/recall_class_1": 0.4865, |
| "eval/recall_class_2": 0.7071, |
| "train_steps_per_second": 6.532, |
| ``` |
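The macro-averaged scores above can be cross-checked against the per-class values, since a macro average is the unweighted mean over classes. The sketch below also shows that the reported balanced accuracy matches the chance-adjusted form of macro recall for three classes. That the metric was computed this way is an inference from the numbers, not something this card states.

```python
# Per-class and macro values copied from the results above.
per_class = {
    "f1": [0.6211, 0.4932, 0.6875],
    "precision": [0.6410, 0.5, 0.6689],
    "recall": [0.6024, 0.4865, 0.7071],
}
reported_macro = {"f1": 0.6006, "precision": 0.6033, "recall": 0.5987}
reported_balanced_accuracy = 0.3980


def macro_average(scores):
    """Unweighted mean over classes."""
    return sum(scores) / len(scores)


def chance_adjusted(balanced_accuracy, num_classes):
    """Rescale so that random guessing scores 0 and perfect prediction scores 1."""
    chance = 1.0 / num_classes
    return (balanced_accuracy - chance) / (1.0 - chance)


for name, scores in per_class.items():
    print(f"{name}: macro={macro_average(scores):.4f} reported={reported_macro[name]:.4f}")

# Unadjusted balanced accuracy equals macro recall.
adjusted = chance_adjusted(macro_average(per_class["recall"]), num_classes=3)
print(f"chance-adjusted balanced accuracy={adjusted:.4f} reported={reported_balanced_accuracy:.4f}")
```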
|
|
|
|
| #### Summary |
|
|
|
|
| ## Technical Specifications |
|
|
| ### Model Architecture and Objective |
|
|
| TBD |
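The introduction states that the model was fine-tuned with a classification loss and a ranking loss, but the exact formulation is not documented here. The sketch below shows one common way to combine the two: cross-entropy over the class logits plus a pairwise hinge loss on a scalar score. The expected-class-index score, the `margin`, and the `ranking_weight` are all assumptions made for illustration, not the authors' objective.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Classification term: negative log-probability of the gold class."""
    return -math.log(softmax(logits)[label])


def expected_score(logits):
    """Scalar certainty score: probability-weighted class index (an assumption)."""
    return sum(i * p for i, p in enumerate(softmax(logits)))


def pairwise_ranking_loss(batch_logits, labels, margin=0.5):
    """Ranking term: hinge penalty whenever an item with a higher label is not
    scored at least `margin` above an item with a lower label."""
    scores = [expected_score(row) for row in batch_logits]
    losses = [
        max(0.0, margin - (scores[i] - scores[j]))
        for i in range(len(labels))
        for j in range(len(labels))
        if labels[i] > labels[j]
    ]
    return sum(losses) / len(losses) if losses else 0.0


def combined_loss(batch_logits, labels, ranking_weight=1.0):
    """Mean cross-entropy plus the weighted ranking term."""
    ce = sum(cross_entropy(row, y) for row, y in zip(batch_logits, labels)) / len(labels)
    return ce + ranking_weight * pairwise_ranking_loss(batch_logits, labels)


# Hypothetical batch of two items whose logits agree with their labels.
batch_logits = [[5.0, 0.0, 0.0], [0.0, 0.0, 5.0]]
labels = [0, 2]
print(combined_loss(batch_logits, labels))
```

The ranking term only constrains the relative order of items, so it can reward a prediction that lands in an adjacent class over one that lands two classes away, which plain cross-entropy treats identically.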
|
|
| ### Compute Infrastructure |
|
|
| [More Information Needed] |
|
|
| #### Hardware |
|
|
| [More Information Needed] |
|
|
| #### Software |
|
|
| Transformers, PyTorch, and Weights & Biases (wandb, used to run the hyperparameter sweep). |
|
|
| ## Citation |
|
|
| TBD |
|
|
|
|
|
|
| ## Model Card Authors |
|
|
| Catarina Belem (Cbelem) |
|
|
| ## Model Card Contact |
|
|
| For more information, contact cbelem@uci.edu. |