---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-7B
tags:
- capability-tagging
- cognition
- qwen
---
# Model Card for CDT-Cognition-Tagger
This model is a key component of the **Cognition-Domain-Task (CDT) framework**, a capability framework for Large Language Models presented in our paper [CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task](https://arxiv.org/abs/2509.24422). It is fine-tuned to classify a given instruction into one of the 18 cognitive abilities defined by the CDT framework.
## Model Details
### Model Description
The Cognition dimension of the CDT framework is inspired by the **Cattell-Horn-Carroll (CHC) theory** of cognitive abilities, adapted for the context of LLMs. This model analyzes an instruction and identifies the primary cognitive skills required to fulfill it.
- **Model type:** Qwen2ForCausalLM
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** Qwen/Qwen2.5-7B (base model)
### Model Sources
- **Repository:** https://github.com/Alessa-mo/CDT
- **Paper Link:** https://arxiv.org/abs/2509.24422
### Basic Usage
Please refer to https://github.com/Alessa-mo/CDT for full instructions. To tag instructions with cognition labels, run the following script from that repository:
```bash
cd tag_annotate
export CUDA_VISIBLE_DEVICES=0
python annotate.py \
    --data_path path/to/your/data \
    --output_dir path/to/output/dir \
    --model_path CDT-Cognition-Tagger \
    --prompt_file ./prompt/annotation_prompt.jsonl \
    --cognition_skill_file ./prompt/cognition.json \
    --domain_skill_file ./prompt/domain.json \
    --task_skill_file ./prompt/task.json \
    --tag_type "cognition" \
    --batch_size 32
```
**Note**: Your data must be a JSON file containing a list of conversations in the following format:
```json
[
  {
    "messages": [
      {
        "role": "user",
        "content": "xxxx"
      },
      {
        "role": "assistant",
        "content": "xxxx"
      }
    ]
  }
]
```
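The format above can be produced from plain (instruction, response) pairs with a short script. The following is a minimal sketch and is not part of the CDT repository: the helper names `build_records`, `validate_records`, and `write_data_file` are hypothetical, and `annotate.py` may impose requirements beyond the basic checks shown here.

```python
import json
from pathlib import Path


def build_records(pairs):
    """Wrap (instruction, response) pairs in the "messages" layout shown above."""
    return [
        {
            "messages": [
                {"role": "user", "content": instruction},
                {"role": "assistant", "content": response},
            ]
        }
        for instruction, response in pairs
    ]


def validate_records(records):
    """Raise ValueError if a record deviates from the documented layout.

    Assumption: only the structure visible in the example is checked
    (a list of objects, each with a non-empty "messages" list whose
    items carry string "role" and "content" fields).
    """
    if not isinstance(records, list):
        raise ValueError("top-level JSON value must be a list")
    for i, record in enumerate(records):
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            raise ValueError(f"record {i}: missing or empty 'messages' list")
        for j, message in enumerate(messages):
            if not isinstance(message, dict):
                raise ValueError(f"record {i}, message {j}: must be an object")
            if not isinstance(message.get("role"), str):
                raise ValueError(f"record {i}, message {j}: 'role' must be a string")
            if not isinstance(message.get("content"), str):
                raise ValueError(f"record {i}, message {j}: 'content' must be a string")


def write_data_file(pairs, path):
    """Build, validate, and save the records as strict JSON; return the records."""
    records = build_records(pairs)
    validate_records(records)
    # json.dumps never emits trailing commas, so the file is valid strict JSON.
    Path(path).write_text(
        json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return records
```

For example, `write_data_file([("What is 2 + 2?", "4")], "data.json")` writes a one-record file that can then be passed to the script via `--data_path data.json`.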
## Citation
If you find this model useful, please cite:
```bibtex
@misc{mo2025cdtcomprehensivecapabilityframework,
  title={CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task},
  author={Haosi Mo and Xinyu Ma and Xuebo Liu and Derek F. Wong and Yu Li and Jie Liu and Min Zhang},
  year={2025},
  eprint={2509.24422},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2509.24422},
}
```