---
license: mit
library_name: transformers
base_model:
- deepseek-ai/DeepSeek-V3-0324
- deepseek-ai/DeepSeek-R1
- deepseek-ai/DeepSeek-R1-0528
pipeline_tag: text-generation
---
# DeepSeek-TNG-R1T2-Chimera

<div align="center">
  <img src="https://354918363417-runtime-assets.s3.eu-central-1.amazonaws.com/company_logo_light.svg"
       alt="TNG Logo"
       width="400"
       style="display: inline-block; vertical-align: middle;"/>
</div>
<br>
<div align="center">
  <a href="LICENSE" style="margin: 2px;">
    <img alt="License" src="https://img.shields.io/badge/License-MIT-f5de53?&color=f5de53" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>
<br>
<div align="center">
  <a href="https://x.com/tngtech/status/1916284566127444468" style="margin: 2px;">
    <img alt="Benchmarks" src="R1T-Chimera_Benchmarks_20250427_V1.jpg" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>

**Model Merge of DeepSeek-R1-0528, DeepSeek-R1 and DeepSeek-V3-0324**

An open-weights model combining the intelligence of R1-0528 and R1 with the token efficiency of V3.

For details on the construction process, which is an extension of that for the original Chimera model, please [read our paper](https://arxiv.org/abs/2506.14794).

[Paper on arXiv](https://arxiv.org/abs/2506.14794) | [Announcement on X](https://x.com/tngtech/status/1916284566127444468) | [LinkedIn post](https://www.linkedin.com/posts/tng-technology-consulting_on-the-weekend-we-released-deepseek-r1t-chimera-activity-7323008947236290560-Cf2m)

## Model Details

- **Architecture**: DeepSeek-MoE transformer-based language model
- **Combination Method**: Merged model weights from DeepSeek-R1-0528, DeepSeek-R1 and DeepSeek-V3-0324
- **Release Date**: 2025-07-0x
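
To give a rough intuition for what merging model weights means, here is a toy sketch of per-tensor weighted averaging across three parent checkpoints. This is only a generic illustration with made-up mixing weights and parameter names, not the actual construction described in the paper, which operates more selectively on the parents' components:

```python
# Illustrative sketch only: combining parent checkpoints by per-tensor
# linear interpolation. This is NOT the actual merge procedure used for
# R1T2-Chimera; mixing weights and tensor names below are hypothetical.

def merge_state_dicts(parents, weights):
    """Weighted average of parameter tensors (here: plain lists of floats)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "mixing weights should sum to 1"
    merged = {}
    for name in parents[0]:
        merged[name] = [
            sum(w * sd[name][i] for sd, w in zip(parents, weights))
            for i in range(len(parents[0][name]))
        ]
    return merged

# Tiny stand-ins for the three parent checkpoints (hypothetical values).
r1_0528 = {"layer.weight": [1.0, 2.0]}
r1      = {"layer.weight": [3.0, 4.0]}
v3_0324 = {"layer.weight": [5.0, 6.0]}

merged = merge_state_dicts([r1_0528, r1, v3_0324], [0.5, 0.25, 0.25])
print(merged["layer.weight"])  # [2.5, 3.5]
```

In practice such merges operate on full tensors (e.g. via `torch`), and the paper's method goes beyond uniform interpolation; the sketch only conveys the basic idea of combining parents in weight space.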

## Use, Out-of-scope Use, Limitations, Risks, and Recommendations

For R1T2-Chimera, we ask you to follow the careful guidelines that Microsoft has created for its DeepSeek-based model "MAI-DS-R1". These guidelines are available [here on Hugging Face](https://huggingface.co/microsoft/MAI-DS-R1).

## Contact

- Email: research@tngtech.com
- X.com: @tngtech

## Citation

```
@misc{tng_technology_consulting_gmbh_2025_07_0x,
  author    = {TNG Technology Consulting GmbH},
  title     = {DeepSeek-TNG-R1T2-Chimera},
  year      = 2025,
  month     = {July},
  url       = {https://huggingface.co/tngtech/DeepSeek-TNG-R1T2-Chimera},
  doi       = {xxx},
  publisher = {Hugging Face}
}
```