---
license: mit
language:
- en
tags:
- attention-analysis
- long-context
- modernbert
base_model: answerdotai/ModernBERT-base
---
# Long-Context Attention Regressor (Entropy)

Predicts the **attention entropy** of a text sample: how spread out versus focused its attention patterns are.
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("KevinDavidHayes/regressor-entropy")
tokenizer = AutoTokenizer.from_pretrained("KevinDavidHayes/regressor-entropy")

text = "Your text here..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=8192)
with torch.no_grad():
    score = model(**inputs).logits.item()

# Higher score = more spread-out attention (the text engages more of its context)
```
## Training

- **Base model**: ModernBERT-base (8K context)
- **Target**: Normalized attention entropy
- **Labels**: Generated via attention analysis of Qwen2.5-7B-Instruct at layer 14
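The exact label-generation pipeline is not published here, but "normalized attention entropy" typically means the Shannon entropy of each query token's attention distribution over keys, divided by its maximum possible value (`log` of the sequence length) so that 0 means fully peaked attention and 1 means uniform attention. A minimal sketch of that computation, assuming attention weights have already been extracted from the target layer (the function name and averaging over heads and queries are illustrative choices, not the authors' confirmed method):

```python
import torch

def attention_entropy(attn: torch.Tensor) -> torch.Tensor:
    """Mean normalized Shannon entropy of attention distributions.

    attn: (heads, seq_len, seq_len) attention weights; each row
    (a query's distribution over keys) should sum to 1.
    Returns a scalar in [0, 1]: 0 = fully focused, 1 = uniform.
    """
    eps = 1e-12  # avoid log(0) for zero weights
    # Entropy of each query's distribution over keys: (heads, seq_len)
    ent = -(attn * (attn + eps).log()).sum(dim=-1)
    # Normalize by the maximum entropy, log(seq_len)
    max_ent = torch.log(torch.tensor(float(attn.shape[-1])))
    return (ent / max_ent).mean()
```

With weights from `model(..., output_attentions=True)`, the layer-14 tensor for one sample would be passed in as `attentions[14][0]`; averaging over layers, heads, or queries is a design choice that the normalization above leaves open.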
## Citation

Part of research on attention-based data filtering for long-context pretraining.