---
base_model:
- openai/clip-vit-large-patch14
base_model_relation: quantized
pipeline_tag: zero-shot-image-classification
tags:
- quantized
- hardware-optimized
- clip
- vision
- tensordyne
license: apache-2.0
---
|
|
|
|
|
## Overview
|
|
Tensordyne builds advanced [AI-inference systems](https://www.tensordyne.ai/inference-system), enabling faster, more affordable, and sustainable generative AI. |
|
|
|
|
|
This repository provides resources to quickly get started with **[CLIP ViT-Large](https://huggingface.co/openai/clip-vit-large-patch14)** on the **Tensordyne Inference System and its SDK**.
|
|
|
|
|
## Model Details
|
|
- **Quantization:** post-training quantization of the base model; no fine-tuning or additional training was performed
|
|
- **Supported data types:** Tensordyne FP16 (tFP16), Tensordyne FP8 (tFP8), mixed-precision |
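As a rough illustration of how a layerwise mixed-precision scheme can assign data types, layers whose calibration statistics show large outliers can be kept in the wider format while the rest run in the narrower one. This is a generic sketch, not the Tensordyne SDK API; the threshold, layer names, and the FP8-style maximum are assumptions for illustration only.

```python
# Generic sketch of layerwise mixed-precision selection (NOT the Tensordyne SDK API).
# Layers whose calibrated activation range exceeds what a narrow FP8-style format
# can represent are kept in the wider FP16-style format.

FP8_MAX = 448.0  # largest normal value of an E4M3-style 8-bit float (assumption)

def assign_dtypes(calib_ranges, fp8_max=FP8_MAX):
    """Map layer name -> 'tFP8' or 'tFP16' based on the calibrated max |activation|."""
    return {
        layer: ("tFP16" if amax > fp8_max else "tFP8")
        for layer, amax in calib_ranges.items()
    }

# Hypothetical calibration statistics (max absolute activation per layer)
ranges = {"attn.qkv": 35.0, "mlp.fc1": 1200.0, "mlp.fc2": 90.0}
print(assign_dtypes(ranges))
```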
|
|
|
|
|
## Quantization

The Tensordyne SDK offers multiple post-training quantization strategies to convert AI models for efficient inference on the Tensordyne Inference System, fully customizable for your optimization targets.

Here we showcase several preselected quantization variants that can be applied on the fly to quantize the model to Tensordyne data types. The calibration-based strategies are defined by quantization configurations provided as `.json` files.
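The idea behind calibration-based quantization can be sketched as a fake-quantize (quantize-dequantize) pass: a scale is derived from calibration data, values are snapped to a low-precision grid, and scaled back. The sketch below is an illustrative approximation only; it does not reproduce the tFP8/tFP16 formats or the SDK's actual algorithm, and the `qmax` value is an assumed E4M3-style bound.

```python
import numpy as np

def calibrate_scale(calib_batches, qmax=448.0):
    # Derive a per-tensor scale from the max |activation| seen during calibration.
    amax = max(float(np.max(np.abs(b))) for b in calib_batches)
    return amax / qmax

def fake_quantize(x, scale, qmax=448.0):
    # Quantize-dequantize: scale into the representable range, round to a coarse
    # grid standing in for a low-precision float format, then scale back.
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

rng = np.random.default_rng(0)
calib = [rng.normal(size=1024) for _ in range(8)]
scale = calibrate_scale(calib)

x = rng.normal(size=16)
x_q = fake_quantize(x, scale)
# In-range values incur a rounding error of at most ~scale/2.
print(float(np.max(np.abs(x - x_q))))
```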
|
|
|
|
|
The quantized models are evaluated on a subset of the [imagenet-1k](https://huggingface.co/datasets/ILSVRC/imagenet-1k) test set. Negative relative accuracy drops indicate that the quantized model performs better than the float base model.
|
|
|
|
|
| Model Configuration         | Top-1 Accuracy [%] | Relative Top-1 Accuracy Drop vs. IEEE FP32 | Details                                                     |
|-----------------------------|--------------------|--------------------------------------------|-------------------------------------------------------------|
| IEEE FP32                   | 71.36              | –                                          | The baseline model trained in IEEE FP32                     |
| calibration_based_tFP16     | 71.34              | 0.02 %                                     | calibration-based tFP16 quantization                        |
| layerwise_mixed_precision   | 71.20              | 0.22 %                                     | calibration-based mixed-precision: tFP8, outliers in tFP16  |
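The relative drop column is presumably computed from the absolute Top-1 accuracies; a minimal helper (not part of the SDK) makes the convention explicit:

```python
def relative_drop_percent(baseline_acc, quantized_acc):
    """Relative Top-1 accuracy drop vs. the baseline, in percent.

    Negative values mean the quantized model outperforms the baseline.
    """
    return (baseline_acc - quantized_acc) / baseline_acc * 100.0

# Mixed-precision variant from the table above: (71.36 - 71.20) / 71.36 * 100
print(round(relative_drop_percent(71.36, 71.20), 2))  # 0.22
```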
|
|
|
|
|
## Getting Started
|
|
Refer to the [Tensordyne Hugging Face Hub tutorial](https://resources.tensordyne.ai/sdk/v0.1.1/tutorials/tutorials/#tensordyne-hugging-face-hub-tutorials) for instructions on using the artifacts provided in this repository. |
|
|
Our [hosted documentation](https://resources.tensordyne.ai/sdk/v0.1.1/) provides more information on Tensordyne's quantization strategies and introduces you to our SDK. |
|
|
|