OpenSQZ
/

Qwen2.5-3B-classifier

	---
	license: apache-2.0
	base_model:
	- Qwen/Qwen2.5-1.5B
	- Qwen/Qwen2.5-3B
	task_categories:
	- text-classification
	language:
	- en
	- zh
	tags:
	- quality-assessment
	- text-quality
	- regression
	pipeline_tag: text-classification
	library_name: transformers
	---

	# Qwen2.5 Text Quality Classifier

	Fine-tuned Qwen2.5-1.5B and Qwen2.5-3B models for automated text quality assessment. Each model predicts a quality score on a 0-1 scale, with a focus on educational value and mathematical intelligence.

	## Model Details

	- Base Models: Qwen2.5-1.5B / Qwen2.5-3B
	- Task: Text Quality Regression
	- Languages: English, Chinese
	- Training Data: [OpenSQZ/Classifiers-Data](https://huggingface.co/datasets/OpenSQZ/Classifiers-Data)
	- Loss Function: MSE Loss

	## Performance

	| Model | Test MSE Loss |
	|-------|---------------|
	| Qwen2.5-1.5B | 0.00226 |
	| Qwen2.5-3B | 0.00209 |
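
MSE is in squared score units, so it can be easier to read as a root-mean-square error (RMSE) on the same 0-1 scale as the score. The short sketch below only takes the square root of the two reported test MSE values; it does not re-evaluate the models, and the variable names are illustrative.

```python
import math

# Test MSE values copied from the Performance table.
test_mse = {"Qwen2.5-1.5B": 0.00226, "Qwen2.5-3B": 0.00209}

# RMSE is the square root of MSE and is expressed in the same units as the 0-1 score.
test_rmse = {name: math.sqrt(mse) for name, mse in test_mse.items()}

for name, rmse in test_rmse.items():
    print(f"{name}: RMSE ~ {rmse:.3f}")
```

This gives roughly 0.048 for the 1.5B model and 0.046 for the 3B model.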

	## Quick Start

	### Installation
	```bash
	pip install transformers torch
	```

	### Usage

	```python
	from transformers import AutoModelForSequenceClassification, AutoTokenizer
	import torch

	# Load model and tokenizer
	model_name = "OpenSQZ/Qwen2.5-1.5B-Classifier"  # or "OpenSQZ/Qwen2.5-3B-classifier" for the 3B model
	model = AutoModelForSequenceClassification.from_pretrained(model_name)
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	# Predict quality score
	text = "Linear algebra is fundamental to understanding vector spaces and matrix operations in mathematics."
	inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=8192)

	with torch.no_grad():
	    outputs = model(**inputs)
	    # The model emits a single logit; sigmoid maps it to the 0-1 score range
	    score = torch.sigmoid(outputs.logits).item()

	print(f"Quality Score: {score:.3f}") # Output: Quality Score: 0.847
	```

	## Quality Score Interpretation

	| Score Range | Quality Level | Use Case |
	|-------------|---------------|----------|
	| 0.8 - 1.0 | Excellent | Premium training data |
	| 0.6 - 0.8 | Good | Standard training data |
	| 0.4 - 0.6 | Average | Conditional use |
	| 0.0 - 0.4 | Poor | Filter out |
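
The table can be turned into a small helper for filtering data. This is a minimal sketch, not part of the released code: the function names are hypothetical, and because the table's ranges share their endpoints, it assumes each range includes its lower bound (so a score of exactly 0.8 counts as "Excellent").

```python
def quality_level(score: float) -> str:
    """Map a 0-1 quality score to a level from the interpretation table.

    Assumption: each range includes its lower bound, so 0.8 is "Excellent".
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score >= 0.8:
        return "Excellent"
    if score >= 0.6:
        return "Good"
    if score >= 0.4:
        return "Average"
    return "Poor"


def keep_for_training(score: float, threshold: float = 0.4) -> bool:
    """Hypothetical filter: by default, drop only scores in the "Poor" range."""
    return score >= threshold


print(quality_level(0.847))     # Excellent
print(keep_for_training(0.35))  # False
```

Raising `threshold` to 0.6 or 0.8 keeps only the "Good" or "Excellent" ranges.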

	## Model Selection

	- 1.5B Model: Faster inference, good for real-time applications
	- 3B Model: Lower test MSE (0.00209 vs. 0.00226), better for batch processing
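
For batch processing, one possible pattern is to split a corpus into fixed-size chunks and score each chunk in one call. The sketch below shows only the chunking logic. `score_batch` is a hypothetical callable that you would supply to wrap the tokenizer and model call, and the batch size of 8 is an arbitrary default, not a recommendation from the authors.

```python
from typing import Callable, Iterator, List, Sequence


def batched(texts: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `texts` with at most `batch_size` items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def score_corpus(
    texts: Sequence[str],
    score_batch: Callable[[Sequence[str]], List[float]],
    batch_size: int = 8,
) -> List[float]:
    """Score every text, one batch at a time, preserving input order.

    `score_batch` is a hypothetical user-supplied function that returns
    one score per text in the batch it receives.
    """
    scores: List[float] = []
    for batch in batched(texts, batch_size):
        batch_scores = score_batch(batch)
        if len(batch_scores) != len(batch):
            raise ValueError("score_batch must return one score per input text")
        scores.extend(batch_scores)
    return scores
```

A real `score_batch` would tokenize the batch (with `padding=True` so texts of different lengths fit in one tensor), run the model under `torch.no_grad()`, and apply the sigmoid to the logits.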

	## Limitations

	- Optimized for educational and mathematical content
	- May not generalize well to creative or subjective content
	- Scores should be used as guidance, not absolute judgments

	## Citation

	```bibtex
	@misc{qwen25_quality_classifier_2025,
	  title={Qwen2.5 Text Quality Classifier},
	  author={Chao Li and Yifan Zhang},
	  year={2025},
	  publisher={OpenSQZ}
	}
	```

	## License

	Apache 2.0