---
library_name: transformers
license: apache-2.0
base_model: allenai/specter2_base
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: results
results: []
---
# 📗 SPECTER2–MAG (Multiclass Classification on MAG Level-0 Fields of Study)
This model is a fine-tuned version of [allenai/specter2_base](https://huggingface.co/allenai/specter2_base) for multiclass bibliometric classification using MAG Fields of Study – Level 0 (SciDocs).
It achieves the following results on the evaluation set:
- Loss: 1.0598
- Accuracy: 0.8310
- Precision Micro: 0.8310
- Precision Macro: 0.8290
- Recall Micro: 0.8310
- Recall Macro: 0.8276
- F1 Micro: 0.8310
- F1 Macro: 0.8263
## Model description
This model is a fine-tuned version of SPECTER2 (`allenai/specter2_base`) adapted for multiclass classification across the 19 top-level Fields of Study (FoS) from the Microsoft Academic Graph (MAG).
The model accepts the title, abstract, or title + abstract of a scientific publication and assigns it to exactly one of the MAG Level-0 domains (e.g., Biology, Chemistry, Computer Science, Engineering, Psychology).
Key characteristics:
* Base model: allenai/specter2_base
* Task: multiclass document classification
* Labels: 19 MAG Field of Study Level-0 categories
* Activation: softmax
* Loss: CrossEntropyLoss
* Output: single best-matching FoS category
MAG Level-0 represents broad disciplinary domains designed for high-level categorization of scientific documents.
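At inference time the classification head produces one logit per label; a softmax over the 19 logits followed by an argmax yields the single predicted field. A minimal sketch of that decision rule (the label subset and logit values below are illustrative, not actual model outputs):

```python
import math

# Illustrative subset of the 19 MAG Level-0 labels (not the full label map).
LABELS = ["Art", "Biology", "Chemistry", "Computer science", "Physics"]

def softmax(logits):
    """Numerically stable softmax over raw classification-head outputs."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits, labels=LABELS):
    """Return the single best-matching label, as in multiclass classification."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

label, prob = predict([0.1, 3.2, 0.5, 1.1, -0.3])
print(label)  # Biology
```

This is why each document receives exactly one field: softmax probabilities sum to 1 and only the argmax is kept.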
## Intended uses & limitations
### Intended uses
This multiclass MAG model is suitable for:
- Assigning publications to **top-level scientific disciplines**
- Enriching metadata in:
- repositories
- research output systems
- funding and project datasets
- bibliometric dashboards
- Supporting scientometric analyses such as:
- broad-discipline portfolio mapping
- domain-level clustering
- modeling research diversification
- Classifying documents when only **title/abstract** is available
The model supports inputs such as:
- **title only**
- **abstract only**
- **title + abstract** (recommended)
### Limitations
- MAG Level-0 categories are **very coarse** (e.g., *Biology*, *Medicine*, *Engineering*), and do not represent subfields.
- Documents spanning multiple fields must be forced into **one** label—an inherent limitation of multiclass classification.
- The training labels come from **MAG’s automatic field assignment pipeline**, not manual expert annotation.
- Not suitable for:
- fine-grained subdisciplines
- downstream tasks requiring multilabel outputs
- WoS Categories or ASJC Areas (use separate models)
- clinical or regulatory decision-making
Predictions should be treated as **high-level disciplinary metadata**, not detailed field classification.
## Training and evaluation data
### Source dataset: **SciDocs**
Training data comes from the **SciDocs** dataset, introduced together with the original SPECTER paper:
> **SciDocs** provides citation graphs, titles, abstracts, and **MAG Fields of Study** for scientific documents derived from MAG.
> For this model, we use **MAG Level-0 FoS**, the 19 top-level scientific domains.
Dataset characteristics:
| Property | Value |
|---------|-------|
| Documents | ~40k scientific papers |
| Labels | 19 FoS Level-0 categories |
| Input fields | Abstract |
| Task type | Multiclass |
| Source | SciDocs (SPECTER paper) |
| License | CC-BY |
## Training procedure
### Preprocessing
- Input text: the paper **abstract**
- Tokenization using the SPECTER2 tokenizer
- Maximum sequence length: **512 tokens**
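The 512-token cap means longer inputs are cut off by the tokenizer. A rough illustration of the idea only (assumption: whitespace-separated words stand in for tokens; the real SPECTER2 tokenizer produces WordPiece subwords, so actual token counts per document are higher):

```python
def truncate(text, max_tokens=512):
    """Crude stand-in for tokenizer truncation: keep only the first
    max_tokens whitespace-separated words. The SPECTER2 tokenizer
    splits words into subwords, so real counts differ."""
    return " ".join(text.split()[:max_tokens])

long_abstract = "token " * 600               # 600 words, over the limit
print(len(truncate(long_abstract).split()))  # 512
```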
### Model
- Base model: `allenai/specter2_base`
- Classification head: linear layer → softmax
- Loss: **CrossEntropyLoss**
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (`adamw_torch_fused`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
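With these settings the linear scheduler decays the learning rate from 2e-05 toward 0 over training. A small sketch of that schedule (assumptions: zero warmup steps, which the card does not report, and 1094 optimizer steps per epoch taken from the results table below, times 10 configured epochs):

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Linear warmup (here none) followed by linear decay to 0,
    mirroring the `linear` lr_scheduler_type."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * (total_steps - step) / max(1, total_steps - warmup_steps)

total = 10 * 1094                       # 10 epochs x 1094 steps/epoch
print(linear_lr(0, total))              # 2e-05 at the start
print(linear_lr(total // 2, total))     # 1e-05 halfway through
```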
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision Micro | Precision Macro | Recall Micro | Recall Macro | F1 Micro | F1 Macro |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------------:|:---------------:|:------------:|:------------:|:--------:|:--------:|
| 0.2603 | 1.0 | 1094 | 0.6733 | 0.8243 | 0.8243 | 0.8315 | 0.8243 | 0.8198 | 0.8243 | 0.8222 |
| 0.1779 | 2.0 | 2188 | 0.6955 | 0.8240 | 0.8240 | 0.8198 | 0.8240 | 0.8203 | 0.8240 | 0.8176 |
| 0.1628 | 3.0 | 3282 | 0.8130 | 0.8315 | 0.8315 | 0.8296 | 0.8315 | 0.8265 | 0.8315 | 0.8269 |
| 0.1136 | 4.0 | 4376 | 0.9842 | 0.8227 | 0.8227 | 0.8254 | 0.8227 | 0.8192 | 0.8227 | 0.8205 |
| 0.0666 | 5.0 | 5470 | 1.0598 | 0.8310 | 0.8310 | 0.8290 | 0.8310 | 0.8276 | 0.8310 | 0.8263 |
### Evaluation results
| | precision | recall | f1-score | support |
|:----------------------|------------:|---------:|-----------:|------------:|
| Art | 0.654867 | 0.845714 | 0.738155 | 175 |
| Biology | 0.982222 | 0.973568 | 0.977876 | 227 |
| Business | 0.914894 | 0.877551 | 0.895833 | 196 |
| Chemistry | 0.97449 | 0.969543 | 0.97201 | 197 |
| Computer science | 0.960452 | 0.894737 | 0.926431 | 190 |
| Economics | 0.816425 | 0.782407 | 0.799054 | 216 |
| Engineering | 0.906103 | 0.927885 | 0.916865 | 208 |
| Environmental science | 0.975369 | 0.916667 | 0.945107 | 216 |
| Geography | 0.758454 | 0.912791 | 0.828496 | 172 |
| Geology | 0.96729 | 0.976415 | 0.971831 | 212 |
| History | 0.62987 | 0.518717 | 0.568915 | 187 |
| Materials science | 0.932432 | 0.958333 | 0.945205 | 216 |
| Mathematics | 0.938776 | 0.94359 | 0.941176 | 195 |
| Medicine | 0.982558 | 0.923497 | 0.952113 | 183 |
| Philosophy | 0.752874 | 0.748571 | 0.750716 | 175 |
| Physics | 0.964824 | 0.974619 | 0.969697 | 197 |
| Political science | 0.642512 | 0.661692 | 0.651961 | 201 |
| Psychology | 0.806283 | 0.758621 | 0.781726 | 203 |
| Sociology | 0.438889 | 0.427027 | 0.432877 | 185 |
| accuracy              |             |          |   0.845641 |        3751 |
| macro avg | 0.842083 | 0.841681 | 0.840318 | 3751 |
| weighted avg | 0.847843 | 0.845641 | 0.845311 | 3751 |
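The macro average above weights every class equally regardless of support, which is why the strong but small *Sociology* drop pulls it below the micro/weighted figures. A minimal check of how a macro average relates to per-class scores (using three F1 values from the table as illustration):

```python
def macro_avg(scores):
    """Unweighted mean over per-class scores, as in the macro rows above."""
    return sum(scores) / len(scores)

f1_subset = {"Art": 0.738155, "Biology": 0.977876, "Sociology": 0.432877}
print(round(macro_avg(list(f1_subset.values())), 4))  # 0.7163
```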
### Framework versions
- Transformers 4.57.1
- Pytorch 2.8.0+cu126
- Datasets 3.6.0
- Tokenizers 0.22.1