jeergrvgreg/uplifting-filter-v5

Model Description

This model is a fine-tuned version of Qwen/Qwen2.5-1.5B for multi-dimensional content scoring with the uplifting filter.

It was trained by knowledge distillation from Gemini Flash, learning to replicate that oracle's judgment patterns on content evaluation.

Filter Focus: DOCUMENTED OUTCOMES for human/planetary wellbeing, not emotional tone or speculation

Intended Use

This model scores articles across 6 semantic dimensions:

  • Human Wellbeing Impact (weight: 0.25): Improvement in health, safety, livelihoods, or basic needs
  • Social Cohesion Impact (weight: 0.15): Communities strengthened, solidarity built, connections across groups
  • Justice Rights Impact (weight: 0.10): Wrongs addressed, accountability achieved, rights expanded
  • Evidence Level (weight: 0.20): How verified are the claimed outcomes?
  • Benefit Distribution (weight: 0.20): Who benefits? How accessible is the benefit?
  • Change Durability (weight: 0.10): How lasting is the change?
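The listed weights sum to 1.0, so a single composite score can be formed as a weighted average of the per-dimension scores. The sketch below is illustrative only (the `composite_score` helper and example values are assumptions, not part of the released model):

```python
# Weights as listed in the model card; they sum to 1.0
WEIGHTS = {
    "human_wellbeing_impact": 0.25,
    "social_cohesion_impact": 0.15,
    "justice_rights_impact": 0.10,
    "evidence_level": 0.20,
    "benefit_distribution": 0.20,
    "change_durability": 0.10,
}

def composite_score(scores):
    """Weighted average of per-dimension scores (each on a 0-10 scale)."""
    return sum(WEIGHTS[d] * scores[d] for d in WEIGHTS)

# Made-up example scores for illustration
example = {
    "human_wellbeing_impact": 8.0,
    "social_cohesion_impact": 6.0,
    "justice_rights_impact": 5.0,
    "evidence_level": 7.0,
    "benefit_distribution": 6.0,
    "change_durability": 4.0,
}
print(composite_score(example))  # 6.4
```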

Training Data

  • Training samples: 7,999
  • Validation samples: 1,000
  • Oracle: Gemini Flash (for ground truth generation)
  • Quality threshold: Articles with quality_score >= 0.7

Training Procedure

Model Architecture

  • Base model: Qwen/Qwen2.5-1.5B
  • Parameters: 1,562,197,504
  • Task: Multi-dimensional regression (8 outputs)
  • Input: Article title + content (max 512 tokens)
  • Output: 8 continuous scores (0-10 range)

Training Configuration

  • Epochs: 3
  • Batch size: 8
  • Learning rate: 2e-05
  • Optimizer: AdamW
  • Loss function: Mean Squared Error (MSE)
  • Gradient checkpointing: Enabled
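The configuration above maps onto the Hugging Face Trainer API roughly as follows. This is a sketch, not the actual training script: dataset loading, tokenization, and the Gemini Flash distillation targets are omitted. Note that `problem_type="regression"` makes the sequence-classification head use MSE loss:

```python
from transformers import AutoModelForSequenceClassification, TrainingArguments

# Regression head with 8 outputs; problem_type="regression" selects MSE loss
model = AutoModelForSequenceClassification.from_pretrained(
    "Qwen/Qwen2.5-1.5B",
    num_labels=8,
    problem_type="regression",
)

args = TrainingArguments(
    output_dir="uplifting-filter-v5",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=2e-5,
    optim="adamw_torch",          # AdamW optimizer
    gradient_checkpointing=True,  # trade compute for activation memory
)
```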

Performance

Overall Metrics

Metric            Value
Validation MAE    0.6807
Training MAE      0.6368
Validation RMSE   0.8799
Training RMSE     0.8215

Per-Dimension Performance (Validation MAE)

Dimension                 MAE
Human Wellbeing Impact    0.6857
Social Cohesion Impact    0.7040
Justice Rights Impact     0.6188
Evidence Level            0.6363
Benefit Distribution      0.7922
Change Durability         0.6475

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "jeergrvgreg/uplifting-filter-v5"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare input
article = {
    "title": "Example Article Title",
    "content": "Article content here..."
}

text = f"{article['title']}\n\n{article['content']}"
inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    scores = outputs.logits[0].numpy()

# Dimension names (the model emits 8 outputs; zip pairs the first 6 with these)
dimensions = [
    'human_wellbeing_impact',
    'social_cohesion_impact',
    'justice_rights_impact',
    'evidence_level',
    'benefit_distribution',
    'change_durability',
]

# Print scores
for dim, score in zip(dimensions, scores):
    print(f"{dim}: {score:.2f}")
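The regression head itself is unbounded, so raw predictions can fall slightly outside the documented 0-10 range. Clamping is a reasonable post-processing step; the helper below is a suggestion, not part of the released model:

```python
def clamp_scores(scores, lo=0.0, hi=10.0):
    """Clip raw regression outputs to the documented 0-10 score range."""
    return [min(max(float(s), lo), hi) for s in scores]

print(clamp_scores([10.3, -0.2, 6.7]))  # [10.0, 0.0, 6.7]
```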

Limitations

  • Model was trained on English news articles
  • Performance may vary on other content types
  • Validation MAE of 0.6807 indicates ~0.7 point average error on the 0-10 scale
  • Some overfitting observed (train/val gap: 0.04)

Ethical Considerations

This model evaluates content based on specific semantic dimensions. Users should:

  • Understand the filter's focus and biases
  • Not use as sole decision-maker for content moderation
  • Regularly evaluate model performance on their specific use case
  • Be aware that automated scoring may miss nuance

Citation

If you use this model, please cite:

@misc{uplifting_filter_v5,
  title={Uplifting Content Filter},
  author={Your Name},
  year={2025},
  url={https://huggingface.co/jeergrvgreg/uplifting-filter-v5}
}

Model Card Contact

For questions or feedback about this model, please open an issue in the repository.
