---
license: mit
datasets:
- rajpurkar/squad
language:
- en
metrics:
- exact_match: 0.837
- f1: 0.911
- squad
base_model:
- FacebookAI/roberta-base
pipeline_tag: question-answering
library_name: transformers
tags:
- optoelectronics
- science
- data-mining
---
# Model Card for OE-RoBERTa

OE-RoBERTa is domain-adapted from RoBERTa-base on research literature in optoelectronics, then fine-tuned on SQuAD v1.1 for extractive question answering. On SQuAD v1.1 it achieves an exact-match score of 83.7 and an F1 score of 91.1.
## Model Details

### Model Description

- **Language(s) (NLP):** English
- **Finetuned from model:** FacebookAI/roberta-base
### Model Sources

- **Repository:** OptoelectronicsLM-codebase
- **Paper:** Cost-Efficient Domain-Adaptive Pretraining of Language Models for Optoelectronics Applications
## Uses

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("question-answering", model="Dingyun-Huang/oe-roberta-base-squad1")
```
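As a minimal sketch of how the pipeline is queried, the snippet below asks a question against a short passage. The question and context are illustrative examples written for this card, not taken from the paper; because the model is extractive, the returned answer is a span of the supplied context.

```python
from transformers import pipeline

pipe = pipeline("question-answering", model="Dingyun-Huang/oe-roberta-base-squad1")

# Illustrative context and question (made up for this example)
context = (
    "Perovskite solar cells have attracted attention because their "
    "power conversion efficiencies have risen rapidly in laboratory settings."
)
result = pipe(
    question="Why have perovskite solar cells attracted attention?",
    context=context,
)

# result is a dict with keys: "score", "start", "end", "answer"
print(result["answer"])
```

The pipeline returns a confidence score alongside the answer span, which can be used to filter low-confidence extractions when mining a large corpus.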
## Citation

**BibTeX:**

```bibtex
@article{doi:10.1021/acs.jcim.4c02029,
  author  = {Huang, Dingyun and Cole, Jacqueline M.},
  title   = {Cost-Efficient Domain-Adaptive Pretraining of Language Models for Optoelectronics Applications},
  journal = {Journal of Chemical Information and Modeling},
  volume  = {65},
  number  = {5},
  pages   = {2476--2486},
  year    = {2025},
  doi     = {10.1021/acs.jcim.4c02029},
  note    = {PMID: 39933074},
  url     = {https://doi.org/10.1021/acs.jcim.4c02029}
}
```