---
license: mit
library_name: transformers
base_model:
- deepseek-ai/DeepSeek-V3-0324
- deepseek-ai/DeepSeek-R1
- deepseek-ai/DeepSeek-R1-0528
pipeline_tag: text-generation
---

# DeepSeek-TNG-R1T2-Chimera
**Model merge of DeepSeek-R1-0528, DeepSeek-R1 and DeepSeek-V3-0324**

An open-weights model combining the intelligence of R1-0528 and R1 with the token efficiency of V3. For details on the construction process, which is an extension of that for the original Chimera model, please [read our paper](https://arxiv.org/abs/2506.14794).

[Paper on arXiv](https://arxiv.org/abs/2506.14794) | [Announcement on X](https://x.com/tngtech/status/1916284566127444468) | [LinkedIn post](https://www.linkedin.com/posts/tng-technology-consulting_on-the-weekend-we-released-deepseek-r1t-chimera-activity-7323008947236290560-Cf2m)

## Model Details

- **Architecture**: DeepSeek-MoE transformer-based language model
- **Combination Method**: merged model weights from DeepSeek-R1-0528, DeepSeek-R1 and DeepSeek-V3-0324
- **Release Date**: 2025-07-0x

## Use, Out-of-scope Use, Limitations, Risks, Recommendations et al.

For R1T2-Chimera, we ask you to follow the careful guidelines that Microsoft created for its DeepSeek-based model "MAI-DS-R1". These guidelines are available [here on Hugging Face](https://huggingface.co/microsoft/MAI-DS-R1).

## Contact

- Email: research@tngtech.com
- X.com: @tngtech

## Citation

```
@misc{tng_technology_consulting_gmbh_2025_07_0x,
  author = { TNG Technology Consulting GmbH },
  title = { DeepSeek-TNG-R1T2-Chimera },
  year = 2025,
  month = { July },
  url = { https://huggingface.co/tngtech/DeepSeek-TNG-R1T2-Chimera },
  doi = { xxx },
  publisher = { Hugging Face }
}
```
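## Usage

The model loads like other DeepSeek-V3-architecture checkpoints via the `transformers` library declared in the metadata above. The snippet below is a minimal sketch rather than an official example: the repository id is the one this card lives under, while the chat-template handling and the `trust_remote_code` flag are assumptions carried over from the DeepSeek base models. Note that the full MoE checkpoint requires a multi-GPU node (or a hosted inference provider) to run.

```python
# Minimal usage sketch (assumption: enough GPU memory for the full
# MoE checkpoint; a quantized variant or an inference provider is
# more practical for most users).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tngtech/DeepSeek-TNG-R1T2-Chimera"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # keep the checkpoint's native precision
    device_map="auto",       # shard layers across available GPUs
    trust_remote_code=True,  # assumption: DeepSeek-V3-style custom code
)

messages = [
    {"role": "user", "content": "Explain mixture-of-experts in one paragraph."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```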