# jeergrvgreg/uplifting-filter-v5

## Model Description
This model is a fine-tuned version of Qwen/Qwen2.5-1.5B for multi-dimensional content scoring with the uplifting filter.
The model was trained via knowledge distillation from Gemini Flash, learning to replicate its judgment patterns on content evaluation.

**Filter focus:** documented outcomes for human/planetary wellbeing, not emotional tone or speculation.
## Intended Use
This model scores articles across 6 semantic dimensions:
- Human Wellbeing Impact (weight: 0.25): Improvement in health, safety, livelihoods, or basic needs
- Social Cohesion Impact (weight: 0.15): Communities strengthened, solidarity built, connections across groups
- Justice Rights Impact (weight: 0.10): Wrongs addressed, accountability achieved, rights expanded
- Evidence Level (weight: 0.20): How verified are the claimed outcomes?
- Benefit Distribution (weight: 0.20): Who benefits? How accessible is the benefit?
- Change Durability (weight: 0.10): How lasting is the change?
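The weights above sum to 1.0, so a single composite score on the same 0-10 scale can be formed as a weighted average of the six per-dimension scores. A minimal sketch (the dictionary keys and the `composite_score` helper are my own naming, not part of the released model):

```python
# Dimension weights from the model card; keys are assumed snake_case names.
WEIGHTS = {
    "human_wellbeing_impact": 0.25,
    "social_cohesion_impact": 0.15,
    "justice_rights_impact": 0.10,
    "evidence_level": 0.20,
    "benefit_distribution": 0.20,
    "change_durability": 0.10,
}

def composite_score(scores: dict) -> float:
    """Weighted sum of per-dimension scores (each on a 0-10 scale)."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)

example = {
    "human_wellbeing_impact": 8.0,
    "social_cohesion_impact": 5.0,
    "justice_rights_impact": 4.0,
    "evidence_level": 7.0,
    "benefit_distribution": 6.0,
    "change_durability": 5.0,
}
print(composite_score(example))  # 6.25 (still on the 0-10 scale)
```

Because the weights sum to 1.0, the composite stays in the same 0-10 range as the individual dimensions.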
## Training Data
- Training samples: 7,999
- Validation samples: 1,000
- Oracle: Gemini Flash (for ground truth generation)
- Quality threshold: Articles with quality_score >= 0.7
## Training Procedure

### Model Architecture
- Base model: Qwen/Qwen2.5-1.5B
- Parameters: 1,562,197,504
- Task: Multi-dimensional regression (8 outputs)
- Input: Article title + content (max 512 tokens)
- Output: 8 continuous scores (0-10 range)
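A regression head is not hard-bounded, so raw outputs can occasionally drift slightly outside the documented 0-10 range. One simple post-processing step (my assumption, not something the released model does internally) is to clamp the scores:

```python
def clamp_scores(scores, lo=0.0, hi=10.0):
    """Clip raw regression outputs into the documented 0-10 range."""
    return [min(max(s, lo), hi) for s in scores]

print(clamp_scores([-0.3, 4.2, 10.7]))  # [0.0, 4.2, 10.0]
```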
### Training Configuration
- Epochs: 3
- Batch size: 8
- Learning rate: 2e-05
- Optimizer: AdamW
- Loss function: Mean Squared Error (MSE)
- Gradient checkpointing: Enabled
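The loss above is plain mean squared error averaged over every output of every example in the batch. As a minimal pure-Python sketch of what is being minimized (function and variable names are my own):

```python
def mse(predictions, targets):
    """Mean squared error over flattened per-example score vectors."""
    errors = [
        (p - t) ** 2
        for pred_row, target_row in zip(predictions, targets)
        for p, t in zip(pred_row, target_row)
    ]
    return sum(errors) / len(errors)

# Two examples with two score outputs each (illustrative numbers).
preds = [[6.0, 5.0], [7.0, 4.0]]
targets = [[6.5, 5.0], [6.0, 4.0]]
print(mse(preds, targets))  # (0.25 + 0.0 + 1.0 + 0.0) / 4 = 0.3125
```

In training, the same averaging runs over the 8 regression outputs per article.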
## Performance

### Overall Metrics
| Metric | Value |
|---|---|
| Validation MAE | 0.6807 |
| Training MAE | 0.6368 |
| Validation RMSE | 0.8799 |
| Training RMSE | 0.8215 |
### Per-Dimension Performance (Validation MAE)
| Dimension | MAE |
|---|---|
| Human Wellbeing Impact | 0.6857 |
| Social Cohesion Impact | 0.7040 |
| Justice Rights Impact | 0.6188 |
| Evidence Level | 0.6363 |
| Benefit Distribution | 0.7922 |
| Change Durability | 0.6475 |
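The MAE and RMSE figures above follow the usual definitions: mean absolute error and root mean squared error between predicted and oracle scores. A small sketch with illustrative numbers (not actual model outputs):

```python
import math

def mae(preds, targets):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)

def rmse(preds, targets):
    """Root mean squared error; penalizes large errors more than MAE."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds))

preds = [6.8, 5.1, 4.2, 7.3]
targets = [7.0, 5.0, 5.0, 7.0]
print(f"MAE:  {mae(preds, targets):.4f}")   # 0.3500
print(f"RMSE: {rmse(preds, targets):.4f}")
```

RMSE ≥ MAE always holds, which matches the reported pairs (0.8799 vs 0.6807 on validation).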
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "jeergrvgreg/uplifting-filter-v5"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare input: title and content joined, truncated to 512 tokens
article = {
    "title": "Example Article Title",
    "content": "Article content here...",
}
text = f"{article['title']}\n\n{article['content']}"
inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)

# Get predictions (the regression head produces 8 continuous scores)
with torch.no_grad():
    outputs = model(**inputs)
scores = outputs.logits[0].numpy()

# Dimension names; zip stops at the shorter sequence, so only the
# first six of the eight outputs are printed here
dimensions = [
    "human_wellbeing_impact",
    "social_cohesion_impact",
    "justice_rights_impact",
    "evidence_level",
    "benefit_distribution",
    "change_durability",
]

# Print scores
for dim, score in zip(dimensions, scores):
    print(f"{dim}: {score:.2f}")
```
## Limitations
- The model was trained on English-language news articles
- Performance may vary on other content types and domains
- A validation MAE of 0.6807 indicates roughly a 0.7-point average error on the 0-10 scale
- Mild overfitting observed (train/validation MAE gap: ~0.04)
## Ethical Considerations
This model evaluates content based on specific semantic dimensions. Users should:
- Understand the filter's focus and biases
- Not use it as the sole decision-maker for content moderation
- Regularly evaluate model performance on their specific use case
- Be aware that automated scoring may miss nuance
## Citation
If you use this model, please cite:
```bibtex
@misc{uplifting_filter_v5.0,
  title={Uplifting Content Filter},
  author={Your Name},
  year={2025},
  url={https://huggingface.co/jeergrvgreg/uplifting-filter-v5}
}
```
## Model Card Contact
For questions or feedback about this model, please open an issue in the repository.