# Academic Paper Classifier
A DistilBERT model fine-tuned to classify academic paper abstracts into arXiv subject categories. Given the abstract of a research paper, the model predicts which area of computer science or statistics the paper belongs to.
## Intended Use
This model is designed for:
- Automated paper triage -- quickly routing new submissions to the appropriate reviewers or reading lists.
- Literature search -- filtering large collections of papers by predicted subject area.
- Research tooling -- as a building block in larger academic-paper analysis pipelines.
The model is not intended for high-stakes decisions such as publication acceptance or funding allocation.
## Labels
| Id | Label | Description |
|---|---|---|
| 0 | cs.AI | Artificial Intelligence |
| 1 | cs.CL | Computation and Language (NLP) |
| 2 | cs.CV | Computer Vision |
| 3 | cs.LG | Machine Learning |
| 4 | cs.NE | Neural and Evolutionary Computing |
| 5 | cs.RO | Robotics |
| 6 | math.ST | Statistics Theory |
| 7 | stat.ML | Machine Learning (Statistics) |
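For post-processing raw model outputs, the id-to-label mapping in the table above can be written as a plain Python dict. (The same mapping should also be recoverable from the model's `config.id2label`; the dict below is just a convenience transcription of the table.)

```python
# Mapping from class index to arXiv category label, as listed in the table above.
ID2LABEL = {
    0: "cs.AI",    # Artificial Intelligence
    1: "cs.CL",    # Computation and Language (NLP)
    2: "cs.CV",    # Computer Vision
    3: "cs.LG",    # Machine Learning
    4: "cs.NE",    # Neural and Evolutionary Computing
    5: "cs.RO",    # Robotics
    6: "math.ST",  # Statistics Theory
    7: "stat.ML",  # Machine Learning (Statistics)
}

# Inverse mapping, useful when preparing labels for training.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}
```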
## Training Procedure

### Base Model
`distilbert-base-uncased` -- a distilled version of BERT that is 60% faster while retaining 97% of BERT's language-understanding performance.
### Dataset

`ccdv/arxiv-classification` -- a curated collection of arXiv paper abstracts with subject-category labels.
### Hyperparameters
| Parameter | Value |
|---|---|
| Learning rate | 2e-5 |
| LR scheduler | Linear with warmup |
| Warmup ratio | 0.1 |
| Weight decay | 0.01 |
| Epochs | 5 |
| Batch size (train) | 16 |
| Batch size (eval) | 32 |
| Max sequence length | 512 |
| Early stopping patience | 3 |
| Seed | 42 |
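The hyperparameters above map onto Hugging Face `TrainingArguments` roughly as follows. This is a sketch of how such a run could be configured, not a verbatim copy of the repository's `train.py`; the `output_dir` name is illustrative, and the max sequence length of 512 is applied at tokenization time rather than here.

```python
from transformers import TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="paper-classifier",   # illustrative path
    learning_rate=2e-5,
    lr_scheduler_type="linear",      # linear schedule with warmup
    warmup_ratio=0.1,
    weight_decay=0.01,
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=42,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,     # required for early stopping
    metric_for_best_model="f1",      # best checkpoint selected by weighted F1
)

# Stop training after 3 evaluations without improvement;
# pass this to Trainer(callbacks=[early_stopping]).
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)
```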
## Metrics
The model is evaluated on accuracy, weighted F1, weighted precision, and weighted recall. The best checkpoint is selected by weighted F1.
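The metrics above can be computed with a `compute_metrics` function in the style expected by the transformers `Trainer`. The sketch below is an illustrative reconstruction using scikit-learn, not necessarily the exact code used in training:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Compute accuracy and weighted precision/recall/F1 from (logits, labels)."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```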
## How to Use

### With the transformers pipeline
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="gr8monk3ys/paper-classifier-model",
)

abstract = (
    "We introduce a new method for neural machine translation that uses "
    "attention mechanisms to align source and target sentences, achieving "
    "state-of-the-art results on WMT benchmarks."
)

result = classifier(abstract)
print(result)
# [{'label': 'cs.CL', 'score': 0.95}]
```
### With the included inference script

```bash
python inference.py \
    --model_path gr8monk3ys/paper-classifier-model \
    --abstract "We propose a convolutional neural network for image recognition..."
```
### Training from scratch

```bash
pip install -r requirements.txt
python train.py \
    --num_train_epochs 5 \
    --learning_rate 2e-5 \
    --per_device_train_batch_size 16 \
    --push_to_hub
```
## Limitations
- The model covers only a fixed set of 8 arXiv categories. Papers from other fields will be forced into one of these buckets.
- Performance may degrade on abstracts that are unusually short, written in a language other than English, or that span multiple subject areas.
- The model inherits any biases present in the DistilBERT base weights and in the training dataset.
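Because every abstract is forced into one of the eight labels, one simple guard against out-of-scope inputs is to treat low-confidence predictions as uncertain. A minimal sketch, assuming the `[{'label': ..., 'score': ...}]` output shape of the pipeline example above; the 0.5 threshold is an arbitrary illustrative choice and should be tuned on held-out data:

```python
def route_prediction(result, threshold=0.5):
    """Return the top prediction, or flag it as uncertain if its score is too low."""
    top = result[0] if isinstance(result, list) else result
    if top["score"] < threshold:
        # Low confidence: likely out-of-scope or multi-topic; route to a human.
        return {"label": "UNCERTAIN", "score": top["score"]}
    return top
```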
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{scaturchio2025paperclassifier,
  title  = {Academic Paper Classifier},
  author = {Lorenzo Scaturchio},
  year   = {2025},
  url    = {https://huggingface.co/gr8monk3ys/paper-classifier-model}
}
```