---
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: section-classifier-imrad
  results: []
datasets:
- saier/unarXive_imrad_clf
---

# section-classifier-imrad

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [saier/unarXive_imrad_clf](https://huggingface.co/datasets/saier/unarXive_imrad_clf) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6404
- Accuracy: 0.7714
- F1: 0.7760
- Precision: 0.7891
- Recall: 0.7714
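
The card does not state how precision, recall, and F1 are averaged across classes. Recall being identical to accuracy (0.7714) is consistent with support-weighted averaging, since weighted recall always equals accuracy. The sketch below shows that relationship on toy labels; the function name and the example labels are hypothetical and not taken from the evaluation set.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    support = Counter(y_true)  # number of true examples per class
    n = len(y_true)
    precision = recall = f1 = 0.0
    for c in sorted(support):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        predicted = sum(1 for p in y_pred if p == c)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / support[c]
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        w = support[c] / n  # weight each class by its share of true labels
        precision += w * p_c
        recall += w * r_c
        f1 += w * f_c
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only.
y_true = ["introduction", "introduction", "introduction", "methods", "methods", "results"]
y_pred = ["introduction", "introduction", "methods", "methods", "results", "results"]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

With weighted averaging, precision and F1 can still differ from accuracy, which matches the pattern in the reported numbers.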

## Model description

This model classifies sections of scientific papers into IMRaD categories (Introduction, Methods, Results, and Discussion). It is a DistilBERT model fine-tuned on the unarXive IMRaD classification dataset with a weighted cross-entropy loss to handle class imbalance.
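
As a rough illustration of a weighted cross-entropy loss, the framework-free sketch below derives per-class weights from label counts and applies them to the loss. The inverse-frequency formula, the function names, and the toy label counts are assumptions; the card does not say how the weights were actually computed.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Assumed scheme: weight_c = n_samples / (num_classes * count_c)."""
    counts = Counter(labels)
    n = len(labels)
    return [n / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]


def weighted_cross_entropy(logits, targets, weights):
    """Weighted cross-entropy, normalized by the sum of the target weights."""
    total, norm = 0.0, 0.0
    for row, y in zip(logits, targets):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += weights[y] * (log_z - row[y])
        norm += weights[y]
    return total / norm


# Hypothetical imbalanced labels, not the dataset's real distribution.
train_labels = [0, 0, 0, 0, 1, 1, 2]
weights = inverse_frequency_weights(train_labels, num_classes=3)
logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [0.1, 0.2, 0.3]]
targets = [0, 1, 2]
loss = weighted_cross_entropy(logits, targets, weights)
print(weights, loss)
```

Rare classes receive larger weights, so their errors contribute more to the loss. In PyTorch the same computation is available through `torch.nn.CrossEntropyLoss(weight=...)`, whose default mean reduction also normalizes by the sum of the target weights.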

## Intended uses & limitations

Intended use: Automatically categorizing sections in academic papers, particularly arXiv submissions.

Limitations: Trained exclusively on arXiv papers; may not generalize well to non-academic text or to papers from other domains. Inputs are limited to 512 tokens, so longer text segments must be truncated or split.
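
One possible way to handle sections longer than the 512-token limit is to split them into overlapping windows and classify each window separately. The sketch below shows only the splitting step on plain token ids; the function name, the 510-token budget (which leaves room for the `[CLS]` and `[SEP]` tokens the tokenizer adds), and the 50-token overlap are assumptions, not part of this model's training or evaluation setup.

```python
def split_into_windows(token_ids, max_tokens=510, overlap=50):
    """Split token ids into overlapping windows of at most max_tokens each."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    if len(token_ids) <= max_tokens:
        return [list(token_ids)]
    step = max_tokens - overlap
    windows = []
    for start in range(0, len(token_ids), step):
        windows.append(list(token_ids[start:start + max_tokens]))
        if start + max_tokens >= len(token_ids):
            break  # the last window already reaches the end of the input
    return windows


# Stand-in ids; in practice these would come from the model's tokenizer
# called with add_special_tokens=False.
ids = list(range(1200))
windows = split_into_windows(ids)
print([len(w) for w in windows])
```

Per-window predictions could then be combined, for example by averaging logits, but how well that works for this model has not been evaluated here.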

## Training and evaluation data

Trained on saier/unarXive_imrad_clf, a dataset of labeled paper sections from arXiv. Training uses class weights in the loss to account for label imbalance across the dataset's five classes (the four IMRaD categories plus related work).

## How to use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "your-username/section-classifier-imrad"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

texts = [
    "In this paper, we propose a new method for retrieval.",
    "We evaluate on three benchmarks and report state-of-the-art results."
]

inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1).tolist()
id2label = model.config.id2label

for t, i in zip(texts, pred_ids):
    print(id2label[i], ":", t)
```

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW, fused PyTorch implementation (`OptimizerNames.ADAMW_TORCH_FUSED`), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
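
To make the schedule concrete, the sketch below computes the learning rate at a given step for linear warmup followed by linear decay, using the learning rate and warmup steps listed above. The total of 48,750 steps is an estimate, not a logged value: the results table in this card puts step 5200 at epoch 0.32, which implies roughly 16,250 steps per epoch over 3 epochs. The function name is hypothetical.

```python
def linear_warmup_lr(step, base_lr=2e-5, warmup_steps=500, total_steps=48_750):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# total_steps is an estimate inferred from the epoch column of the results table.
schedule = {s: linear_warmup_lr(s) for s in (0, 250, 500, 5200, 48_750)}
for step, lr in schedule.items():
    print(step, lr)
```

The rate rises from 0 to 2e-05 over the first 500 steps and then decays linearly, so it would still be close to its peak at step 5200.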

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:------:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 1.5822        | 0.0062 | 100  | 1.5712          | 0.3785   | 0.3802 | 0.5100    | 0.3785 |
| 1.2276        | 0.0123 | 200  | 1.1797          | 0.3953   | 0.3201 | 0.5746    | 0.3953 |
| 1.0040        | 0.0185 | 300  | 1.0034          | 0.5159   | 0.5110 | 0.6109    | 0.5159 |
| 0.8683        | 0.0246 | 400  | 0.8951          | 0.5797   | 0.5820 | 0.6596    | 0.5797 |
| 0.9856        | 0.0308 | 500  | 0.8343          | 0.6607   | 0.6648 | 0.6940    | 0.6607 |
| 0.8125        | 0.0369 | 600  | 0.8134          | 0.6559   | 0.6609 | 0.7009    | 0.6559 |
| 0.8667        | 0.0431 | 700  | 0.7731          | 0.6905   | 0.6956 | 0.7283    | 0.6905 |
| 0.8262        | 0.0492 | 800  | 0.7533          | 0.6881   | 0.6957 | 0.7343    | 0.6881 |
| 0.7894        | 0.0554 | 900  | 0.7523          | 0.6379   | 0.6419 | 0.7273    | 0.6379 |
| 0.7810        | 0.0615 | 1000 | 0.7639          | 0.6919   | 0.7023 | 0.7349    | 0.6919 |
| 0.7102        | 0.0677 | 1100 | 0.7708          | 0.7163   | 0.7207 | 0.7467    | 0.7163 |
| 0.6794        | 0.0738 | 1200 | 0.7344          | 0.7057   | 0.7147 | 0.7469    | 0.7057 |
| 0.7838        | 0.0800 | 1300 | 0.7484          | 0.7133   | 0.7188 | 0.7467    | 0.7133 |
| 0.7457        | 0.0861 | 1400 | 0.7024          | 0.6845   | 0.6910 | 0.7501    | 0.6845 |
| 0.6696        | 0.0923 | 1500 | 0.7355          | 0.6763   | 0.6867 | 0.7516    | 0.6763 |
| 0.5735        | 0.0984 | 1600 | 0.7082          | 0.7231   | 0.7305 | 0.7575    | 0.7231 |
| 0.7231        | 0.1046 | 1700 | 0.6850          | 0.7253   | 0.7303 | 0.7529    | 0.7253 |
| 0.7180        | 0.1108 | 1800 | 0.7049          | 0.7039   | 0.7120 | 0.7554    | 0.7039 |
| 0.7093        | 0.1169 | 1900 | 0.7192          | 0.6841   | 0.6919 | 0.7533    | 0.6841 |
| 0.6047        | 0.1231 | 2000 | 0.6679          | 0.7407   | 0.7459 | 0.7639    | 0.7407 |
| 0.6954        | 0.1292 | 2100 | 0.7083          | 0.7237   | 0.7329 | 0.7616    | 0.7237 |
| 0.6577        | 0.1354 | 2200 | 0.6808          | 0.7215   | 0.7278 | 0.7583    | 0.7215 |
| 0.6743        | 0.1415 | 2300 | 0.6904          | 0.7251   | 0.7338 | 0.7682    | 0.7251 |
| 0.5870        | 0.1477 | 2400 | 0.6747          | 0.7217   | 0.7301 | 0.7728    | 0.7217 |
| 0.6079        | 0.1538 | 2500 | 0.6609          | 0.7502   | 0.7563 | 0.7745    | 0.7502 |
| 0.5927        | 0.1600 | 2600 | 0.6757          | 0.7485   | 0.7544 | 0.7698    | 0.7485 |
| 0.6936        | 0.1661 | 2700 | 0.6970          | 0.7548   | 0.7606 | 0.7769    | 0.7548 |
| 0.7466        | 0.1723 | 2800 | 0.6619          | 0.7401   | 0.7475 | 0.7726    | 0.7401 |
| 0.7301        | 0.1784 | 2900 | 0.6474          | 0.7337   | 0.7404 | 0.7691    | 0.7337 |
| 0.6256        | 0.1846 | 3000 | 0.6474          | 0.7381   | 0.7456 | 0.7733    | 0.7381 |
| 0.7141        | 0.1907 | 3100 | 0.7102          | 0.7231   | 0.7360 | 0.7727    | 0.7231 |
| 0.6770        | 0.1969 | 3200 | 0.6436          | 0.7177   | 0.7233 | 0.7651    | 0.7177 |
| 0.7148        | 0.2031 | 3300 | 0.6410          | 0.7493   | 0.7560 | 0.7775    | 0.7493 |
| 0.6010        | 0.2092 | 3400 | 0.6683          | 0.7626   | 0.7667 | 0.7773    | 0.7626 |
| 0.7568        | 0.2154 | 3500 | 0.6563          | 0.7590   | 0.7660 | 0.7836    | 0.7590 |
| 0.6437        | 0.2215 | 3600 | 0.6377          | 0.7419   | 0.7504 | 0.7839    | 0.7419 |
| 0.7817        | 0.2277 | 3700 | 0.6439          | 0.7487   | 0.7560 | 0.7814    | 0.7487 |
| 0.6606        | 0.2338 | 3800 | 0.6534          | 0.7534   | 0.7603 | 0.7821    | 0.7534 |
| 0.6466        | 0.2400 | 3900 | 0.6859          | 0.7063   | 0.7167 | 0.7661    | 0.7063 |
| 0.6616        | 0.2461 | 4000 | 0.6461          | 0.7217   | 0.7307 | 0.7775    | 0.7217 |
| 0.6033        | 0.2523 | 4100 | 0.6394          | 0.7419   | 0.7490 | 0.7761    | 0.7419 |
| 0.6647        | 0.2584 | 4200 | 0.6229          | 0.7680   | 0.7722 | 0.7833    | 0.7680 |
| 0.7093        | 0.2646 | 4300 | 0.6309          | 0.7419   | 0.7488 | 0.7752    | 0.7419 |
| 0.6773        | 0.2707 | 4400 | 0.6342          | 0.7594   | 0.7651 | 0.7817    | 0.7594 |
| 0.6944        | 0.2769 | 4500 | 0.6363          | 0.7522   | 0.7588 | 0.7821    | 0.7522 |
| 0.5588        | 0.2830 | 4600 | 0.6503          | 0.7431   | 0.7516 | 0.7838    | 0.7431 |
| 0.6522        | 0.2892 | 4700 | 0.6412          | 0.7526   | 0.7589 | 0.7783    | 0.7526 |
| 0.6321        | 0.2953 | 4800 | 0.6569          | 0.7666   | 0.7727 | 0.7914    | 0.7666 |
| 0.6983        | 0.3015 | 4900 | 0.6327          | 0.7339   | 0.7414 | 0.7767    | 0.7339 |
| 0.6051        | 0.3077 | 5000 | 0.6754          | 0.7229   | 0.7340 | 0.7752    | 0.7229 |
| 0.7185        | 0.3138 | 5100 | 0.6220          | 0.7532   | 0.7590 | 0.7809    | 0.7532 |
| 0.7003        | 0.3200 | 5200 | 0.6200          | 0.7413   | 0.7479 | 0.7788    | 0.7413 |


### Framework versions

- Transformers 5.3.0
- PyTorch 2.10.0+cu128
- Datasets 4.6.1
- Tokenizers 0.22.2