AssistantModel

1. Introduction

AssistantModel is designed for interactive assistant applications. This checkpoint was selected for its combined performance on the knowledge retrieval and instruction following benchmarks, which makes it well suited to AI assistant deployment.

2. Evaluation Results

Comprehensive Benchmark Results

Category	Benchmark	Assistant-v1	Assistant-v2	AssistantModel
Core Reasoning Tasks	Math Reasoning	0.510	0.535	0.550
	Logical Reasoning	0.789	0.801	0.819
	Common Sense	0.716	0.702	0.736
Language Understanding	Reading Comprehension	0.671	0.685	0.700
	Question Answering	0.582	0.599	0.607
	Text Classification	0.803	0.811	0.828
	Sentiment Analysis	0.777	0.781	0.792
Generation Tasks	Code Generation	0.615	0.631	0.650
	Creative Writing	0.588	0.579	0.610
	Dialogue Generation	0.621	0.635	0.644
	Summarization	0.745	0.755	0.767
Specialized Capabilities	Translation	0.782	0.799	0.804
	Knowledge Retrieval	0.651	0.668	0.676
	Instruction Following	0.733	0.749	0.758
	Safety Evaluation	0.718	0.701	0.739
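
The introduction says the checkpoint was chosen on combined knowledge retrieval and instruction following performance, but does not say how the two scores are combined. The sketch below assumes an unweighted mean, which is only one plausible reading. The scores are copied from the table above; the names `SCORES`, `combined_scores`, and `best_checkpoint` are hypothetical and not part of any released tooling.

```python
# Scores copied from the benchmark table, in column order:
# (Assistant-v1, Assistant-v2, AssistantModel)
CHECKPOINTS = ("Assistant-v1", "Assistant-v2", "AssistantModel")
SCORES = {
    "Knowledge Retrieval": (0.651, 0.668, 0.676),
    "Instruction Following": (0.733, 0.749, 0.758),
}


def combined_scores(scores=SCORES, checkpoints=CHECKPOINTS):
    """Return the unweighted mean score per checkpoint.

    Assumption: both benchmarks carry equal weight. The model card
    does not state the actual weighting.
    """
    n = len(scores)
    return {
        name: sum(row[i] for row in scores.values()) / n
        for i, name in enumerate(checkpoints)
    }


def best_checkpoint(combined):
    """Return the checkpoint name with the highest combined score."""
    return max(combined, key=combined.get)


if __name__ == "__main__":
    combined = combined_scores()
    for name, value in combined.items():
        print(f"{name}: {value:.4f}")
    print("selected:", best_checkpoint(combined))
```

Under this equal-weight assumption AssistantModel ranks first. Because it also has the highest score on each of the two benchmarks individually, any non-negative weighting would give the same ranking.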

3. License

Apache-2.0 License

4. Contact

Open an issue on GitHub.
