---
license: mit
base_model: answerdotai/ModernBERT-base
tags:
- modernbert
- entity-infilling
- text-summarization
- masked-modeling
- pytorch
library_name: transformers
datasets:
- cnn_dailymail
model-index:
- name: Glazkov/sum-entity-infilling
results:
- task:
type: entity-infilling
name: Entity Infilling
dataset:
name: cnn_dailymail
type: cnn_dailymail
metrics:
- name: Entity Recall
type: entity_recall
value: TBD
---
# Glazkov/sum-entity-infilling
This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) trained on the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset for entity infilling tasks.
## Model Description
The model reconstructs masked entities in text using summary context. It was trained with a masked-modeling objective: entities in the source text are replaced with `<mask>` tokens, and the model learns to predict the original entity tokens conditioned on the paired summary.
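The exact pairing of summary and source text is defined by the training code; the sketch below is an assumed input format (encoding the two texts as a sequence pair is an illustration, not the confirmed pipeline):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Glazkov/sum-entity-infilling")

summary = "Membership gives the ICC jurisdiction over alleged crimes..."
masked_text = "<mask> officially became the 123rd member of the International Criminal Court..."

# Hypothetical encoding: summary and masked article as one sequence pair;
# the model predicts the original entity tokens at each <mask> position.
inputs = tokenizer(summary, masked_text, return_tensors="pt",
                   truncation=True, max_length=512)
```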
## Intended Uses & Limitations
**Intended Uses:**
- Entity reconstruction in summarization
- Text completion and infilling
- Research in masked language modeling
- Educational purposes
**Limitations:**
- Trained primarily on news article data
- May not perform well on highly technical or domain-specific content
- Performance varies with entity length and context
## Training Details
### Evaluation Results
The model was evaluated on a validation split of the CNN/DailyMail dataset using entity-level recall; a sketch of the recall computation follows the list below.
**Metrics:**
- Entity Recall: Percentage of correctly reconstructed entities
- Token Accuracy: Token-level prediction accuracy
- Exact Match: Full sequence reconstruction accuracy
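As a rough illustration of the primary metric (this is not the repository's evaluation code; it assumes aligned lists of gold and predicted entity strings and simple exact matching):

```python
def entity_recall(gold_entities: list[str], predicted_entities: list[str]) -> float:
    """Fraction of gold entities the model reproduced exactly.

    Illustrative sketch only: the repository's evaluation code defines
    the canonical metric (e.g. normalization, partial matches).
    """
    if not gold_entities:
        return 1.0
    hits = sum(gold == pred for gold, pred in zip(gold_entities, predicted_entities))
    return hits / len(gold_entities)

# Two of three entities reconstructed correctly -> recall ≈ 0.67
print(entity_recall(["Palestine", "ICC", "2015"], ["Palestine", "ICC", "2014"]))
```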
## Usage
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# EntityInfillingInference ships with the training repository (src/),
# not with transformers itself
from src.train.inference import EntityInfillingInference

MODEL_ID = "Glazkov/sum-entity-infilling"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)

# Initialize the inference helper
inference = EntityInfillingInference(
    model_path=MODEL_ID,
    device="cuda",  # or "cpu"
)

# Example inference
summary = "Membership gives the ICC jurisdiction over alleged crimes..."
masked_text = "<mask> officially became the 123rd member of the International Criminal Court..."
predictions = inference.predict_masked_entities(
    summary=summary,
    masked_text=masked_text,
)
```
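If the training repository is not available, the checkpoint can also be queried through the standard `fill-mask` pipeline. This is a minimal sketch, assuming the model exposes a regular masked-LM head and its tokenizer defines `<mask>` as the mask token:

```python
from transformers import pipeline

# Sketch without the repo's helper class: prepend the summary as context
fill = pipeline("fill-mask", model="Glazkov/sum-entity-infilling")

summary = "Membership gives the ICC jurisdiction over alleged crimes..."
masked_text = "<mask> officially became the 123rd member of the International Criminal Court..."

for candidate in fill(f"{summary} {masked_text}", top_k=3):
    print(candidate["token_str"], candidate["score"])
```

Note that `fill-mask` predicts one token per `<mask>`, so multi-token entities need the repository's helper class or iterative decoding.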
## Training Configuration
This model was trained using the following configuration:
- Base Model: answerdotai/ModernBERT-base
- Dataset: cnn_dailymail
- Task: Entity Infilling
- Framework: PyTorch with Accelerate
- Training Date: 2025-10-17
For more details about the training process, see the [training configuration](training_config.txt) file.
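The linked file is authoritative for hyperparameters. As a hedged sketch of how entity-infilling training pairs are typically built (the actual pipeline, including how entity spans are found, e.g. with an NER tagger, is not documented here), one might mask entity spans in the article and keep the paired summary as context:

```python
def mask_entities(article: str, entity_spans: list[tuple[int, int]]) -> tuple[str, list[str]]:
    """Replace each (start, end) character span in `article` with <mask>.

    Hypothetical helper: the real preprocessing may operate on tokens
    and draw spans from a different source.
    """
    pieces, targets, cursor = [], [], 0
    for start, end in sorted(entity_spans):
        pieces.append(article[cursor:start])
        pieces.append("<mask>")
        targets.append(article[start:end])
        cursor = end
    pieces.append(article[cursor:])
    return "".join(pieces), targets

article = "Palestine officially became the 123rd member of the International Criminal Court."
masked_text, targets = mask_entities(article, [(0, 9)])
print(masked_text)  # <mask> officially became the 123rd member ...
print(targets)      # ['Palestine']
```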
## Model Architecture
The model uses ModernBERT architecture with:
- 22 transformer layers
- Hidden size: 768
- Vocabulary: Custom with `<mask>` token support
- Maximum sequence length: 512 tokens
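The figures above can be cross-checked against the hosted configuration (a quick sketch, assuming the repository contains a standard `transformers` config file):

```python
from transformers import AutoConfig

# Inspect the hosted config to confirm depth and hidden size
config = AutoConfig.from_pretrained("Glazkov/sum-entity-infilling")
print(config.num_hidden_layers, config.hidden_size)
```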
## Acknowledgments
- [Hugging Face Transformers](https://github.com/huggingface/transformers) for the model architecture
- [CNN/DailyMail dataset](https://huggingface.co/datasets/cnn_dailymail) for training data
- [Answer.AI](https://huggingface.co/answerdotai) for the ModernBERT base model
## License
This model is licensed under the MIT License.