ReviewMate Rhetorical Sentence Classifier — v1

A DistilBERT model that classifies sentences from research paper abstracts into five rhetorical roles. Optimized for empirical scientific writing across STEM domains, with strong performance on computer science / AI literature.

Task

Given a sentence from a research abstract, predict its rhetorical role:

BACKGROUND — context, prior work, motivation
OBJECTIVE — research goal or hypothesis
METHODS — approach, design, techniques used
RESULTS — findings, benchmarks, measurements
CONCLUSIONS — interpretation, implications, future work

These rhetorical categories apply to empirical scientific writing across most STEM domains.

Performance

Evaluated on held-out test sets:

Test Set	Domain	Accuracy	F1 Macro	F1 Weighted
CSAbstruct	CS / AI	0.7517	0.7556	0.7527
PubMed-RCT	Biomedical	0.8833	0.8294	0.8825
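On PubMed-RCT the macro F1 (0.8294) sits below the weighted F1 (0.8825), which is the pattern one would expect when less frequent classes score lower than frequent ones. The sketch below illustrates how the two averages differ. It uses made-up toy labels, not the actual test data, and the helper names are hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} for every label that occurs in y_true."""
    support = Counter(y_true)
    scores = {}
    for label in support:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = (2 * tp / denom if denom else 0.0, support[label])
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by how often each class occurs."""
    scores = per_class_f1(y_true, y_pred)
    total = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total


# Toy data (illustrative only): a frequent class and a rare, harder class.
y_true = ["RESULTS"] * 8 + ["OBJECTIVE"] * 2
y_pred = ["RESULTS"] * 8 + ["RESULTS", "OBJECTIVE"]

print(f"macro F1:    {macro_f1(y_true, y_pred):.3f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")
```

In the toy data the rare class is misclassified half the time, so the macro average drops further than the weighted one.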

Best suited for computer science, AI/ML, applied STEM, and quantitative empirical research. Performance may degrade on theoretical, mathematical, or humanities papers due to different rhetorical conventions.

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
model_id = "Himel000/reviewmate-classifier-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# The model classifies one sentence at a time; longer inputs are truncated to 64 tokens
sentence = "We propose a novel transformer architecture for sequence classification."
inputs = tokenizer(sentence, return_tensors="pt", truncation=True, max_length=64)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Map the highest-scoring logit to its label name
predicted_class_id = outputs.logits.argmax(dim=-1).item()
predicted_label = model.config.id2label[predicted_class_id]
print(predicted_label)
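
The usage snippet returns only the top label. When a confidence score is also useful, the logits can be passed through a softmax. The sketch below does this in plain Python on a hand-written logits row. The `ID2LABEL` mapping and its ordering are assumptions for illustration; in practice the mapping should be read from `model.config.id2label`.

```python
import math

# Hypothetical label order; read the real one from model.config.id2label.
ID2LABEL = {
    0: "BACKGROUND",
    1: "OBJECTIVE",
    2: "METHODS",
    3: "RESULTS",
    4: "CONCLUSIONS",
}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, id2label=ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits row, standing in for outputs.logits[0].tolist()
label, confidence = top_prediction([0.2, 3.1, 1.4, -0.5, -1.0])
print(label, round(confidence, 3))
```

With PyTorch tensors, `torch.softmax(outputs.logits, dim=-1)` yields the same probabilities directly.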

Training Approach

The model was fine-tuned in two sequential stages: first on large-scale labeled rhetorical data, then on the target domain to adapt to its writing style.

Stage 1 — Large-scale rhetorical learning:

Base: distilbert-base-uncased
Data: PubMed-RCT-200k (~2.2M labeled sentences)
Epochs: 2, learning rate: 5e-5, effective batch size: 128
Purpose: Learn general rhetorical patterns from the largest available labeled dataset

Stage 2 — Domain adaptation:

Continued fine-tuning from Stage 1 weights
Data: CSAbstruct (~11k labeled CS sentences)
Epochs: 4, learning rate: 2e-5 (low to preserve Stage 1 knowledge), batch size: 32
Purpose: Adapt rhetorical patterns to CS/AI writing conventions

Total compute: ~4 hours (Kaggle T4 x2).

The two-stage approach raised F1 on CSAbstruct from 0.39 (after Stage 1 alone, for which CS abstracts are out of domain) to 0.76, showing the value of staged transfer learning for cross-domain rhetorical classification.
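
As a rough check on the relative scale of the two stages, the listed dataset sizes, epochs, and batch sizes imply the optimizer step counts computed below. This is back-of-the-envelope arithmetic from the approximate figures above, not logged training statistics. The `StageConfig` class is a hypothetical container introduced for illustration, and the sketch assumes every listed sentence is used for training.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    """Hypothetical container for one fine-tuning stage."""
    name: str
    num_sentences: int    # approximate number of training sentences
    epochs: int
    learning_rate: float
    batch_size: int       # effective batch size per optimizer step

    def optimizer_steps(self) -> int:
        steps_per_epoch = math.ceil(self.num_sentences / self.batch_size)
        return steps_per_epoch * self.epochs


stage1 = StageConfig("pubmed-rct-200k", 2_200_000, 2, 5e-5, 128)
stage2 = StageConfig("csabstruct", 11_000, 4, 2e-5, 32)

for stage in (stage1, stage2):
    print(f"{stage.name}: ~{stage.optimizer_steps():,} optimizer steps")
```

Under these assumptions Stage 2 takes only a few percent as many update steps as Stage 1, which fits its role as a light adaptation pass at a lower learning rate.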

Project Context

This model powers the rhetorical extraction component of ReviewMate, an AI-powered literature scanning and synthesis tool for empirical research papers.

Scope and Limitations

Optimized for empirical scientific writing (introduction → method → result → conclusion structure)
Strongest on CS, AI/ML, applied STEM, and quantitative empirical research
Not designed for theoretical mathematics (theorem-proof structure), humanities, or non-empirical writing
Single-sentence classification (no surrounding sentence context)
Cross-domain expansion to additional fields is part of the project's open contribution roadmap
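
Because the model scores one sentence at a time, an abstract has to be split into sentences before classification. The sketch below uses a naive regular-expression splitter. The splitter and the `split_sentences` name are assumptions for illustration, and a dedicated sentence tokenizer would likely handle abbreviations and other edge cases more robustly.

```python
import re

# Split on whitespace that follows ., !, or ? and precedes an uppercase letter.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(abstract: str) -> list[str]:
    """Naively split an abstract into sentences."""
    parts = _SENTENCE_BOUNDARY.split(abstract.strip())
    return [p.strip() for p in parts if p.strip()]


abstract = (
    "Sentence classification is widely studied. "
    "We fine-tune DistilBERT in two stages. "
    "The model reaches 0.75 accuracy on CSAbstruct."
)

for sentence in split_sentences(abstract):
    print(sentence)
```

Each returned sentence can then be tokenized and classified on its own, or the whole list can be passed to the tokenizer with `padding=True` for batched inference.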

Citation

Datasets used for training:

Dernoncourt, F., & Lee, J. Y. (2017). PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts. IJCNLP 2017.

Cohan, A., Beltagy, I., King, D., Dalvi, B., & Weld, D. (2019). Pretrained Language Models for Sequential Sentence Classification. EMNLP 2019.


Model Details

Format: Safetensors
Model size: 67M params
Tensor type: F32
Base model: distilbert/distilbert-base-uncased
