---
license: mit
base_model: answerdotai/ModernBERT-base
tags:
- modernbert
- entity-infilling
- text-summarization
- masked-modeling
- pytorch
library_name: transformers
datasets:
- cnn_dailymail
model-index:
- name: Glazkov/sum-entity-infilling
  results:
  - task:
      type: entity-infilling
      name: Entity Infilling
    dataset:
      name: cnn_dailymail
      type: cnn_dailymail
    metrics:
    - name: Entity Recall
      type: entity_recall
      value: TBD
---

# Glazkov/sum-entity-infilling

This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base), trained on the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset for entity infilling.

## Model Description

The model reconstructs masked entities in text using summary context. It was trained with a masked-modeling objective: entities in the source text are replaced with `<mask>` tokens, and the model learns to predict the original entities conditioned on the accompanying summary.
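
As an illustration of this setup, a training pair might be built as below. This is a hedged sketch, not the published preprocessing code: the hard-coded entity list and the `[SEP]`-style joining of summary and masked text are assumptions.

```python
# Hedged sketch of an entity-infilling input pair. The entity spans are
# hard-coded for illustration; how entities are actually selected (e.g. by
# an NER step) is not documented in this card.
summary = "Palestine joins the International Criminal Court."
source = (
    "Palestine officially became the 123rd member "
    "of the International Criminal Court."
)
entities = ["Palestine"]  # entities to hide from the model

masked_source = source
for entity in entities:
    masked_source = masked_source.replace(entity, "<mask>", 1)

# The summary supplies the context used to reconstruct the masked entity.
model_input = f"{summary} [SEP] {masked_source}"
print(model_input)
```
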
## Intended Uses & Limitations

**Intended Uses:**
- Entity reconstruction in summarization
- Text completion and infilling
- Research in masked language modeling
- Educational purposes

**Limitations:**
- Trained primarily on news article data
- May not perform well on highly technical or domain-specific content
- Performance varies with entity length and context

## Training Details

### Training Procedure

Training hyperparameters are recorded in the [training configuration](training_config.txt) file; a summary appears in the Training Configuration section below.

### Evaluation Results

The model was evaluated on a validation set from the CNN/DailyMail dataset using the metrics below.

**Metrics:**
- Entity Recall: percentage of correctly reconstructed entities
- Token Accuracy: token-level prediction accuracy
- Exact Match: full-sequence reconstruction accuracy
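
The evaluation script is not published with this card, so the following is a minimal sketch of plausible definitions for these three metrics, assuming predictions and references are plain strings:

```python
# Hypothetical metric definitions for illustration; not the card
# author's evaluation code.
def entity_recall(predicted_entities, gold_entities):
    """Share of gold entities the model reproduced exactly."""
    if not gold_entities:
        return 0.0
    hits = sum(p.strip() == g.strip()
               for p, g in zip(predicted_entities, gold_entities))
    return hits / len(gold_entities)

def token_accuracy(predicted_tokens, gold_tokens):
    """Token-level prediction accuracy over aligned sequences."""
    if not gold_tokens:
        return 0.0
    hits = sum(p == g for p, g in zip(predicted_tokens, gold_tokens))
    return hits / len(gold_tokens)

def exact_match(predicted_text, gold_text):
    """1.0 when the full reconstructed sequence matches the reference."""
    return float(predicted_text.strip() == gold_text.strip())
```
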
## Usage

Note that `EntityInfillingInference` is a helper from the training repository (`src.train.inference`); it is not part of `transformers`.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM
# EntityInfillingInference ships with the training repository, not transformers
from src.train.inference import EntityInfillingInference

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Glazkov/sum-entity-infilling")
model = AutoModelForMaskedLM.from_pretrained("Glazkov/sum-entity-infilling")

# Initialize the inference helper
inference = EntityInfillingInference(
    model_path="Glazkov/sum-entity-infilling",
    device="cuda",  # or "cpu"
)

# Example inference
summary = "Membership gives the ICC jurisdiction over alleged crimes..."
masked_text = "<mask> officially became the 123rd member of the International Criminal Court..."

predictions = inference.predict_masked_entities(
    summary=summary,
    masked_text=masked_text,
)
print(predictions)
```

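If the training repository is not available, the checkpoint can also be queried as a plain masked LM. This is a minimal sketch, assuming the hosted model works with the `fill-mask` pipeline and that prepending the summary approximates the training input format:

```python
# Hedged fallback: treat the checkpoint as an ordinary masked LM.
# Prepending the summary as context is an assumption about the training
# input format, not a documented API of this model.
from transformers import pipeline

fill = pipeline("fill-mask", model="Glazkov/sum-entity-infilling")

summary = "Membership gives the ICC jurisdiction over alleged crimes..."
masked_text = "<mask> officially became the 123rd member of the International Criminal Court."

for candidate in fill(f"{summary} {masked_text}")[:3]:
    print(candidate["token_str"], round(candidate["score"], 3))
```
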
## Training Configuration

This model was trained with the following configuration:
- Base Model: answerdotai/ModernBERT-base
- Dataset: cnn_dailymail
- Task: Entity Infilling
- Framework: PyTorch with Accelerate
- Training Date: 2025-10-17

For more details about the training process, see the [training configuration](training_config.txt) file.

## Model Architecture

The model uses the ModernBERT architecture with:
- 22 transformer layers (as in ModernBERT-base)
- Hidden size: 768
- Vocabulary: custom, with `<mask>` token support
- Maximum sequence length: 512 tokens
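
As a quick check of these numbers, the hosted configuration can be inspected through the standard `transformers` config attributes:

```python
# Inspect the architecture of the hosted checkpoint; the attribute names
# are the standard transformers config fields.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Glazkov/sum-entity-infilling")
print(config.num_hidden_layers, config.hidden_size)
```
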
## Acknowledgments

- [Hugging Face Transformers](https://github.com/huggingface/transformers) for the model implementation
- [CNN/DailyMail dataset](https://huggingface.co/datasets/cnn_dailymail) for training data
- [Answer.AI](https://huggingface.co/answerdotai) for the ModernBERT base model

## License

This model is licensed under the MIT License.