Sentence Interestingness Scorer

This model scores sentences by their "interestingness": how compelling, notable, or attention-grabbing they are.

Model Description

  • Task: Sentence Scoring / Interestingness Ranking
  • Base Model: answerdotai/ModernBERT-large
  • Training Method: Contrastive learning with a Bradley-Terry model (log-sigmoid loss)
  • Architecture: Siamese network with shared encoder
  • Training Data: Wikipedia article sentences curated for interesting vs. less interesting pairs
  • Language: English

How It Works

The model was trained on pairs of sentences where one is more "interesting" than the other. Using contrastive learning, it learns to assign higher scores to more compelling sentences.

The model takes a context (e.g., the surrounding article text) paired with a candidate sentence and outputs a scalar score; higher scores indicate more compelling sentences.
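
Concretely, the Bradley-Terry model puts the probability that sentence a is preferred over sentence b at sigmoid(s_a - s_b), where s is the model's scalar score, and training minimizes the negative log of that probability. A minimal PyTorch sketch of such a loss (the function name and arguments are illustrative, not taken from this repository):

import torch.nn.functional as F

def bradley_terry_loss(score_better, score_worse):
    # P(better preferred over worse) = sigmoid(score_better - score_worse);
    # minimizing -log of that probability pushes the two scores apart
    return -F.logsigmoid(score_better - score_worse).mean()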

Usage

Scoring a Single Sentence

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("derenrich/sentence-scorer")
model = AutoModelForSequenceClassification.from_pretrained("derenrich/sentence-scorer")
model.eval()

# Context and sentence to score
context = "The Crash at Crush was a publicity stunt in Texas in 1896."
sentence = "An estimated 40,000 people attended the event."

# Tokenize
inputs = tokenizer(context, sentence, return_tensors="pt", truncation=True, max_length=384)

# Get score
with torch.no_grad():
    outputs = model(**inputs)
    score = outputs.logits.squeeze().item()

print(f"Interestingness score: {score:.4f}")

Ranking Sentences in a Passage

import nltk
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

nltk.download('punkt_tab')  # sentence tokenizer data

tokenizer = AutoTokenizer.from_pretrained("derenrich/sentence-scorer")
model = AutoModelForSequenceClassification.from_pretrained("derenrich/sentence-scorer")
model.eval()

text = """
The Crash at Crush was a one-day publicity stunt in Texas that took place on 
September 15, 1896. Two uncrewed locomotives were crashed into each other 
head-on at high speed. An estimated 40,000 people attended the event. 
Unexpectedly, the impact caused both engine boilers to explode, resulting 
in a shower of flying debris that killed two people.
"""

# Split into sentences
sentences = nltk.sent_tokenize(text)

# Score each sentence
scores = []
for sentence in sentences:
    inputs = tokenizer(text, sentence, return_tensors="pt", truncation=True, max_length=384)
    with torch.no_grad():
        score = model(**inputs).logits.squeeze().item()
    scores.append((sentence, score))

# Rank by interestingness
ranked = sorted(scores, key=lambda x: x[1], reverse=True)

print("Sentences ranked by interestingness:")
for i, (sent, score) in enumerate(ranked, 1):
    print(f"{i}. [{score:.3f}] {sent}")

Training Details

  • Epochs: 3
  • Learning Rate: 2e-5
  • Batch Size: 8
  • Max Sequence Length: 384
  • Optimizer: AdamW with weight decay 0.01
  • Loss Function: Bradley-Terry log-sigmoid loss
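
A single optimization step under these settings might look like the sketch below. The pair format (context, better sentence, worse sentence) and the helper name training_step are assumptions for illustration, not the actual training code behind this checkpoint:

import torch.nn.functional as F
from torch.optim import AdamW

optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
model.train()

def training_step(context, better_sentence, worse_sentence):
    # Score both members of the pair with the shared (siamese) encoder
    better = tokenizer(context, better_sentence, return_tensors="pt",
                       truncation=True, max_length=384)
    worse = tokenizer(context, worse_sentence, return_tensors="pt",
                      truncation=True, max_length=384)
    s_better = model(**better).logits.squeeze()
    s_worse = model(**worse).logits.squeeze()
    loss = -F.logsigmoid(s_better - s_worse)  # Bradley-Terry log-sigmoid loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()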

Intended Use

This model is designed to:

  • Rank sentences by interestingness within a document
  • Identify the most compelling parts of an article
  • Help with content summarization by finding notable sentences
  • Generate "hook" text for content previews (see the sketch after this list)
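
For the last two uses, a small helper can return the top-k sentences of an article in their original order, reusing the scoring loop from the ranking example above; extract_hooks is a hypothetical name, not part of this repository:

def extract_hooks(text, k=1):
    """Return the k highest-scoring sentences, in original document order."""
    sentences = nltk.sent_tokenize(text)
    scored = []
    for sentence in sentences:
        inputs = tokenizer(text, sentence, return_tensors="pt",
                           truncation=True, max_length=384)
        with torch.no_grad():
            score = model(**inputs).logits.squeeze().item()
        scored.append((sentence, score))
    top = {s for s, _ in sorted(scored, key=lambda x: x[1], reverse=True)[:k]}
    return [s for s in sentences if s in top]

print(extract_hooks(text, k=2))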

Limitations

  • Works best on English text
  • Trained primarily on Wikipedia-style content
  • "Interestingness" is subjective; the model reflects training data biases
  • Scores are relative, not absolute measures; compare them only within a single document (see the sketch below)
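
Since scores only carry meaning relative to other sentences scored against the same context, it can help to normalize them within a document before displaying or thresholding them. One illustrative post-processing option, a softmax over the document's sentence scores (a presentation choice, not something the model prescribes):

import torch

# `scores` holds the (sentence, raw_score) pairs from the ranking example
raw = torch.tensor([s for _, s in scores])
# Softmax yields relative weights over this document's sentences; the weights
# are comparable within the document, not across documents
weights = torch.softmax(raw, dim=0).tolist()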

Citation

If you use this model, please cite:

@misc{sentence-scorer,
  title={Sentence Interestingness Scorer},
  author={Anonymous},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/derenrich/sentence-scorer}
}