---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-1.5B
- Qwen/Qwen2.5-3B
task_categories:
- text-classification
language:
- en
- zh
tags:
- quality-assessment
- text-quality
- regression
pipeline_tag: text-classification
library_name: transformers
---

# Qwen2.5 Text Quality Classifier

Fine-tuned Qwen2.5-1.5B and Qwen2.5-3B models for automated text quality assessment. The models predict a quality score on a 0-1 scale, with a focus on educational value and mathematical intelligence.

## Model Details

- **Base Models**: Qwen2.5-1.5B / Qwen2.5-3B
- **Task**: Text Quality Regression
- **Languages**: English, Chinese
- **Training Data**: [OpenSQZ/Classifiers-Data](https://huggingface.co/datasets/OpenSQZ/Classifiers-Data)
- **Loss Function**: MSE Loss

## Performance

| Model | Test MSE Loss |
|-------|---------------|
| Qwen2.5-1.5B | 0.00226 |
| Qwen2.5-3B | 0.00209 |

## Quick Start

### Installation

```bash
pip install transformers torch
```

### Usage

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "OpenSQZ/Qwen2.5-1.5B-Classifier"  # or Qwen2.5-3B-Quality-Classifier
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Predict quality score
text = "Linear algebra is fundamental to understanding vector spaces and matrix operations in mathematics."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=8192)

# Inference only, so gradient tracking is disabled
with torch.no_grad():
    outputs = model(**inputs)
    # Squash the single regression logit into the 0-1 range
    score = torch.sigmoid(outputs.logits).item()

print(f"Quality Score: {score:.3f}")
# Output: Quality Score: 0.847
```

## Quality Score Interpretation

| Score Range | Quality Level | Use Case |
|-------------|---------------|----------|
| 0.8 - 1.0 | Excellent | Premium training data |
| 0.6 - 0.8 | Good | Standard training data |
| 0.4 - 0.6 | Average | Conditional use |
| 0.0 - 0.4 | Poor | Filter out |

## Model Selection

- **1.5B Model**: Faster inference, good for real-time applications
- **3B Model**: Higher accuracy, better for batch processing

## Limitations

- Optimized for educational and mathematical content
- May not generalize well to creative or subjective content
- Scores should be used as guidance, not absolute judgments

## Citation

```bibtex
@misc{qwen25_quality_classifier_2025,
  title={Qwen2.5 Text Quality Classifier},
  author={Chao Li and Yifan Zhang},
  year={2025},
  publisher={OpenSQZ}
}
```

## License

Apache 2.0
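
## Example: Applying the Score Bands

The bands in the Quality Score Interpretation table can be turned into a simple filtering step once scores have been computed. The following is a minimal sketch, not part of the released models: `quality_level` and `filter_by_score` are hypothetical helper names, and it assumes that each band includes its lower bound (the table does not say which band a score of exactly 0.4, 0.6, or 0.8 belongs to).

```python
from typing import Iterable, List, Tuple

# (lower bound, level) pairs taken from the interpretation table, highest first.
# Assumption: each band includes its lower bound, so 0.8 counts as "Excellent".
BANDS: List[Tuple[float, str]] = [
    (0.8, "Excellent"),
    (0.6, "Good"),
    (0.4, "Average"),
    (0.0, "Poor"),
]


def quality_level(score: float) -> str:
    """Map a score in [0, 1] to its quality level from the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    for lower, level in BANDS:
        if score >= lower:
            return level
    return "Poor"  # not reached for valid input; kept as a safe default


def filter_by_score(
    scored: Iterable[Tuple[str, float]], threshold: float = 0.4
) -> List[str]:
    """Keep texts whose score is at or above the threshold.

    The default of 0.4 drops only the "Poor" band ("Filter out" in the table).
    """
    return [text for text, score in scored if score >= threshold]


if __name__ == "__main__":
    # Illustrative scores only; real scores come from the model.
    samples = [("doc A", 0.91), ("doc B", 0.55), ("doc C", 0.12)]
    for text, score in samples:
        print(text, quality_level(score))
    print(filter_by_score(samples))
```

Raising the threshold to 0.6 or 0.8 keeps only the "Good" or "Excellent" bands; which cutoff is appropriate depends on the dataset, in line with the limitation that scores are guidance rather than absolute judgments.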