---
base_model:
- opensourcerelease/DeepSeek-R1-bf16
base_model_relation: quantized
pipeline_tag: text-generation
tags:
- quantized
- hardware-optimized
- deepseek
- tensordyne
license: mit
---
## 📝 Overview
Tensordyne builds advanced [AI-inference systems](https://www.tensordyne.ai/inference-system), enabling faster, more affordable, and sustainable generative AI.
This repository provides resources to quickly get started with **[DeepSeek-R1](https://huggingface.co/opensourcerelease/DeepSeek-R1-bf16)** on the **Tensordyne Inference System and its SDK**.
## 🧩 Model Details
- **Quantization:** post-training quantization of the base model, no fine-tuning or additional training was performed
- **Supported data types:** Tensordyne FP16 (tFP16), Tensordyne FP8 (tFP8), mixed-precision
## ⚙️ Quantization
The Tensordyne SDK offers multiple post-training quantization strategies for converting AI models to run efficiently on the Tensordyne Inference System. Each strategy is fully customizable for your optimization targets.
Here we showcase several preselected quantization variants that can be applied on the fly to convert the model to Tensordyne data types. The calibration-based strategies are defined by quantization configurations provided as `.json` files.
The quantized models are evaluated on 6% of the [WikiText-2 raw v1](https://huggingface.co/datasets/Salesforce/wikitext) test set. A negative relative perplexity drop means the quantized model's perplexity is lower than that of the BF16 base model, i.e. it performs better on this subset.
| Model Configuration    | Absolute Perplexity | Relative Perplexity Drop vs. BF16 | Details                             |
|------------------------|---------------------|-----------------------------------|-------------------------------------|
| BF16                   | 2.202               | –                                 | The baseline model trained in BF16  |
| calibration_free_tFP16 | 2.189               | -0.57 %                           | Calibration-free tFP16 quantization |
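
To make the sign convention in the table concrete, here is a minimal sketch of how such numbers can be computed. It assumes the standard definition of perplexity (the exponential of the mean negative log-likelihood per token) and a relative change defined as `(quantized - baseline) / baseline`. Both function names are hypothetical and are not part of the Tensordyne SDK, and the SDK's exact evaluation procedure may differ.

```python
import math


def perplexity(token_log_probs):
    """Perplexity as exp of the mean negative natural-log likelihood per token.

    Assumption: this is the textbook definition, not necessarily the SDK's code path.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_perplexity_change(quantized_ppl, baseline_ppl):
    """Signed change in percent; negative means the quantized model scores lower (better)."""
    return 100.0 * (quantized_ppl - baseline_ppl) / baseline_ppl


# Rounded perplexities taken from the table above.
bf16_ppl = 2.202
tfp16_ppl = 2.189

change = relative_perplexity_change(tfp16_ppl, bf16_ppl)
print(f"{change:.2f} %")
```

With the rounded table values this gives roughly -0.59 %, close to the reported -0.57 %. The small gap is presumably because the published figure was computed from unrounded perplexities; that is an assumption, not something stated in this repository.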
## 🚀 Getting Started
Refer to the [Tensordyne Hugging Face Hub tutorial](https://resources.tensordyne.ai/sdk/v0.1.1/tutorials/tutorials/#tensordyne-hugging-face-hub-tutorials) for instructions on using the artifacts provided in this repository.
Our [hosted documentation](https://resources.tensordyne.ai/sdk/v0.1.1/) provides more information on Tensordyne's quantization strategies and introduces you to our SDK.