SafetyModel

1. Introduction

SafetyModel is optimized for safety evaluation metrics. This checkpoint achieved the highest safety_evaluation score of our training run, which indicates strong alignment with safety guidelines as measured by that metric.
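
As a minimal sketch of this kind of selection, assuming per-checkpoint metrics are available as a dictionary, the best checkpoint can be picked with a single `max` call. The function name `best_checkpoint`, the checkpoint names, and the scores below are hypothetical placeholders, not values from the actual training run; only the metric name `safety_evaluation` comes from this card.

```python
def best_checkpoint(scores_by_checkpoint, metric="safety_evaluation"):
    """Return the name of the checkpoint with the highest value for `metric`."""
    return max(
        scores_by_checkpoint,
        key=lambda name: scores_by_checkpoint[name][metric],
    )


# Placeholder data for illustration only (NOT results from the real run).
example_scores = {
    "checkpoint-a": {"safety_evaluation": 0.61},
    "checkpoint-b": {"safety_evaluation": 0.70},
    "checkpoint-c": {"safety_evaluation": 0.66},
}

if __name__ == "__main__":
    print(best_checkpoint(example_scores))
```

Selecting on a single metric is the simplest policy; a real pipeline might also check that other benchmarks do not regress.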

2. Evaluation Results

Comprehensive Benchmark Results

Category	Benchmark	SafeModel-v1	SafeModel-v2	SafetyModel
Core Reasoning Tasks	Math Reasoning	0.510	0.535	0.550
	Logical Reasoning	0.789	0.801	0.819
	Common Sense	0.716	0.702	0.736
Language Understanding	Reading Comprehension	0.671	0.685	0.700
	Question Answering	0.582	0.599	0.607
	Text Classification	0.803	0.811	0.828
	Sentiment Analysis	0.777	0.781	0.792
Generation Tasks	Code Generation	0.615	0.631	0.650
	Creative Writing	0.588	0.579	0.610
	Dialogue Generation	0.621	0.635	0.644
	Summarization	0.745	0.755	0.767
Specialized Capabilities	Translation	0.782	0.799	0.804
	Knowledge Retrieval	0.651	0.668	0.676
	Instruction Following	0.733	0.749	0.758
	Safety Evaluation	0.718	0.701	0.739

Overall Performance Summary

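SafetyModel scores 0.739 on Safety Evaluation, ahead of SafeModel-v1 (0.718) and SafeModel-v2 (0.701), and it posts the highest score on all 15 benchmarks in the table above. This performance on safety metrics makes it suitable for deployment in safety-critical applications.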
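
The comparison in the table above can be rechecked with a short script. The sketch below copies the scores from the table and reports, per model, an unweighted mean, the benchmarks on which SafetyModel leads, and the benchmarks where SafeModel-v2 falls below SafeModel-v1. The helper names are illustrative, and the unweighted mean is an assumption about how one might aggregate; this card does not define an official aggregate score.

```python
MODELS = ("SafeModel-v1", "SafeModel-v2", "SafetyModel")

# Scores copied from the benchmark table, in the column order of MODELS.
SCORES = {
    "Math Reasoning": (0.510, 0.535, 0.550),
    "Logical Reasoning": (0.789, 0.801, 0.819),
    "Common Sense": (0.716, 0.702, 0.736),
    "Reading Comprehension": (0.671, 0.685, 0.700),
    "Question Answering": (0.582, 0.599, 0.607),
    "Text Classification": (0.803, 0.811, 0.828),
    "Sentiment Analysis": (0.777, 0.781, 0.792),
    "Code Generation": (0.615, 0.631, 0.650),
    "Creative Writing": (0.588, 0.579, 0.610),
    "Dialogue Generation": (0.621, 0.635, 0.644),
    "Summarization": (0.745, 0.755, 0.767),
    "Translation": (0.782, 0.799, 0.804),
    "Knowledge Retrieval": (0.651, 0.668, 0.676),
    "Instruction Following": (0.733, 0.749, 0.758),
    "Safety Evaluation": (0.718, 0.701, 0.739),
}


def mean_scores(scores):
    """Unweighted mean score per model (an assumed aggregate, not an official one)."""
    n = len(scores)
    return {
        model: sum(row[i] for row in scores.values()) / n
        for i, model in enumerate(MODELS)
    }


def wins(scores, model="SafetyModel"):
    """Benchmarks on which `model` has the highest score (ties count as wins)."""
    i = MODELS.index(model)
    return [name for name, row in scores.items() if row[i] == max(row)]


def regressions(scores, old="SafeModel-v1", new="SafeModel-v2"):
    """Benchmarks on which `new` scores lower than `old`."""
    i, j = MODELS.index(old), MODELS.index(new)
    return [name for name, row in scores.items() if row[j] < row[i]]


if __name__ == "__main__":
    for model, mean in mean_scores(SCORES).items():
        print(f"{model}: mean = {mean:.3f}")
    print("SafetyModel leads on", len(wins(SCORES)), "of", len(SCORES), "benchmarks")
    print("SafeModel-v2 below SafeModel-v1 on:", ", ".join(regressions(SCORES)))
```

Keeping the scores in one dictionary makes it easy to add a column for a new checkpoint and rerun the same checks.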

3. License

Licensed under the MIT License.

4. Contact

Please open an issue on GitHub for inquiries.
