# ScientificTextClassification_ResearchField

## Overview
This is a RoBERTa-base model fine-tuned for multi-class classification of scientific article abstracts. The model predicts the primary research field (e.g., Physics, Biology, Computer Science) from the abstract text alone, supporting automated journal indexing and literature-review organization.
## Model Architecture

RoBERTa was chosen for its robust pretraining and strong handling of the long-range dependencies common in technical and scientific prose.
- Base Model: `roberta-base` (an optimized BERT variant trained without the next-sentence prediction objective).
- Classification Head: Outputs 8 distinct categories (`num_labels: 8`).
- Input Data: Detailed scientific abstracts from diverse journals.
- Output: A probability distribution over the 8 classes: Physics, Chemistry, Medicine, Computer Science, Biology, Geoscience, Materials Science, and Engineering.
- Training Dataset: ScientificArticleAbstract_Classification, providing abstracts linked to their high-level research disciplines.
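The output step described above can be sketched in plain Python: the classification head produces 8 logits, which a softmax turns into the probability distribution over the field labels. The label ordering and the example logits below are assumptions for illustration; the model's actual `id2label` mapping should be consulted.

```python
import math

# The eight research fields from the card; this index order is an
# assumption -- check the model's id2label map for the real ordering.
LABELS = ["Physics", "Chemistry", "Medicine", "Computer Science",
          "Biology", "Geoscience", "Materials Science", "Engineering"]

def softmax(logits):
    """Convert raw classifier logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_field(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Hypothetical logits for a single abstract:
label, prob = predict_field([0.2, -1.1, 0.5, 3.4, 0.1, -0.8, 0.0, 1.2])
```

In practice these logits would come from the fine-tuned RoBERTa head; the sketch only shows how they map to the 8-class distribution.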
## Intended Use
The model offers utility in several scientific and information retrieval contexts:
- Automated Library and Repository Indexing: Rapidly and accurately tagging new publications with their correct discipline.
- Literature Review Automation: Filtering large databases of articles to focus on specific fields.
- Grant Proposal Routing: Assisting research institutions in routing incoming proposals to the appropriate review panel or expert based on the summary.
- Trend Analysis: Tracking the volume and convergence of research across different fields.
## Limitations
- Interdisciplinary Papers: The model performs single-label classification. It may struggle with highly interdisciplinary abstracts that bridge two or more distinct fields (e.g., computational chemistry or bio-engineering).
- Vocabulary Drift: Scientific terminology evolves quickly. New sub-disciplines or extremely novel concepts may not be classified correctly until the model is retrained.
- Class Imbalance: If the underlying distribution of the eight fields in the real world shifts significantly from the training set, performance may vary.
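One lightweight mitigation for the single-label limitation above is to flag predictions where the top two class probabilities are close, signalling a possibly interdisciplinary abstract that deserves human review. The margin threshold below is illustrative, not tuned.

```python
def flag_interdisciplinary(probs, margin=0.15):
    """Flag a prediction as potentially interdisciplinary when the gap
    between the top two class probabilities falls below `margin`.

    The 0.15 default is an illustrative choice, not a tuned value."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return (top1 - top2) < margin

# A confident single-field prediction vs. a borderline one
# (e.g. a computational-chemistry abstract splitting two fields):
confident = [0.85, 0.05, 0.03, 0.02, 0.02, 0.01, 0.01, 0.01]
borderline = [0.42, 0.38, 0.08, 0.04, 0.03, 0.02, 0.02, 0.01]
```

Flagged abstracts could be routed to multi-label review rather than accepted at face value.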
# MODEL 3: EcommerceAspectSentiment_BART
This model is a BART-large sequence-to-sequence model fine-tuned for abstractive multi-aspect sentiment summarization based on Dataset 3 (EcommerceCustomerReview_MultiAspectRating).
`config.json`

```json
{
  "_name_or_path": "facebook/bart-large",
  "architectures": [
    "BartForConditionalGeneration"
  ],
  "model_type": "bart",
  "vocab_size": 50265,
  "d_model": 1024,
  "encoder_layers": 12,
  "decoder_layers": 12,
  "encoder_attention_heads": 16,
  "decoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "decoder_ffn_dim": 4096,
  "dropout": 0.1,
  "activation_function": "gelu",
  "init_std": 0.02,
  "num_labels": 3,
  "max_position_embeddings": 1024,
  "eos_token_id": 2,
  "bos_token_id": 0,
  "pad_token_id": 1,
  "is_encoder_decoder": true,
  "scale_embedding": false,
  "forced_eos_token_id": 2,
  "transformers_version": "4.35.2"
}
```
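The dimensions in this config can be sanity-checked programmatically. The stdlib sketch below (not the `transformers` loader) verifies two invariants of the BART-large shape: the model width divides evenly across attention heads, and the feed-forward width is 4x the model width.

```python
import json

# An abridged copy of the config above, limited to the fields used here.
config = json.loads("""{
  "model_type": "bart",
  "d_model": 1024,
  "encoder_layers": 12,
  "decoder_layers": 12,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "is_encoder_decoder": true
}""")

# Per-head dimension: model width split across the attention heads.
head_dim = config["d_model"] // config["encoder_attention_heads"]

# Feed-forward expansion ratio relative to the model width.
ffn_ratio = config["encoder_ffn_dim"] // config["d_model"]
```

For BART-large these come out to a 64-dimensional per-head size and the standard 4x feed-forward expansion.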
## Evaluation results

- Accuracy (Top-1): 0.941 (self-reported)
- Macro F1 Score: 0.935 (self-reported)