# ModernBERT-base-cos
## Model Description
ModernBERT-base-cos is a ModernBERT-based sequence classification model specifically fine-tuned to assess the quality of summaries in a QnA context. This model is designed to evaluate how well a generated summary captures essential information needed for question-answering tasks as part of research on the "chain of summaries" approach.
## Intended Use
This model evaluates the quality and completeness of summaries by providing a quality score. It helps determine whether a summary adequately captures the information needed for downstream QnA tasks, making it useful for:
- Researchers working on summarization evaluation
- QnA pipeline optimization
- Educational applications requiring assessment of student-generated summaries
- Content creation platforms where summary quality is important
## Model Details
- **Model Type**: ModernBERT-based sequence classification model with sigmoid activation for multi-label classification
- **Version**: 1.0
- **License**: [Add your preferred license]
- **Developed by**: [Your name/organization]
- **Base Model**: ModernBERT-base architecture with a multi-label classification head
- **Language**: English
- **Training Data**: [Brief description of your training data - e.g., "Paired texts and summaries with quality annotations"]
- **Input**: Text summaries
- **Output**: Quality scores between 0 and 1 for multiple quality dimensions
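Because the head applies a sigmoid to each logit independently, every quality dimension receives its own score in (0, 1). A minimal sketch of that mapping, using illustrative logit values (not real model outputs):

```python
import math

def sigmoid(x: float) -> float:
    """Map a raw logit to an independent score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative logits for three hypothetical quality dimensions
logits = [2.0, 0.0, -1.5]
scores = [sigmoid(x) for x in logits]
print([round(s, 3) for s in scores])  # [0.881, 0.5, 0.182]
```

Unlike a softmax, the dimensions do not compete: a summary can score high (or low) on all of them at once.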
## Limitations
- The model is optimized for English text only
- Performance may decrease for very specialized domains not represented in the training data
- The model evaluates summaries in the context of QnA specifically, not general summarization quality
- Maximum input length is limited by ModernBERT's context window (8,192 tokens); longer inputs are truncated
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_id = "williambrach/ModernBERT-base-cos"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).to("cuda")


def summary_score(
    model,
    tokenizer,
    summaries: list[str],
    device: str = "cuda",
    return_tensor: bool = True,
):
    # Pad and truncate so variable-length summaries batch cleanly
    inputs = tokenizer(
        summaries, return_tensors="pt", padding=True, truncation=True
    ).to(device)
    with torch.no_grad():
        outputs = model(**inputs)
    # Sigmoid maps each logit to an independent score in (0, 1)
    scores = torch.sigmoid(outputs["logits"])
    if not return_tensor:
        scores = scores.cpu().numpy().tolist()
    return scores


# Example
texts = [
    "test",
]
scores = summary_score(model, tokenizer, texts, return_tensor=False)
print(scores)
```
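Since the model returns independent per-dimension scores, a simple way to turn them into accept/reject decisions is a fixed cutoff. A minimal sketch (the 0.5 threshold and the score values here are assumptions for illustration, not recommendations from this model card):

```python
def passes_threshold(scores: list[float], threshold: float = 0.5) -> list[bool]:
    """Flag each quality dimension whose score clears the cutoff."""
    return [s >= threshold for s in scores]

# Hypothetical per-dimension scores for one summary
print(passes_threshold([0.91, 0.34, 0.67]))  # [True, False, True]
```

In practice you would tune the threshold per dimension on held-out data rather than use a single global cutoff.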
## References and Citations
This model is part of research on the chain of summaries approach for QnA tasks. If you use this model in your research, please cite:
[Your citation information]