---
base_model:
- openai/clip-vit-large-patch14
base_model_relation: quantized
pipeline_tag: zero-shot-image-classification
tags:
- quantized
- hardware-optimized
- clip
- vision
- tensordyne
license: apache-2.0
---
|
|
|
|
|
## Overview
|
|
Tensordyne builds advanced [AI-inference systems](https://www.tensordyne.ai/inference-system), enabling faster, more affordable, and sustainable generative AI. |
|
|
|
|
|
This repository provides resources to quickly get started with **[CLIP ViT-Large](https://huggingface.co/openai/clip-vit-large-patch14)** on the **Tensordyne Inference System and its SDK**.
|
|
|
|
|
## Model Details
|
|
- **Quantization:** post-training quantization of the base model; no fine-tuning or additional training was performed
|
|
- **Supported data types:** Tensordyne FP16 (tFP16), Tensordyne FP8 (tFP8), mixed-precision |
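As a rough illustration of how a layerwise mixed-precision scheme can assign data types, layers whose calibration statistics show large outliers can be kept in the wider format while the rest run in the narrower one. This is a generic sketch, not the Tensordyne SDK API; the threshold, layer names, and the FP8-style maximum are assumptions for illustration only.

```python
# Generic sketch of layerwise mixed-precision selection (NOT the Tensordyne SDK API).
# Layers whose calibrated activation range exceeds what a narrow FP8-style format
# can represent are kept in the wider FP16-style format.

FP8_MAX = 448.0  # largest normal value of an E4M3-style 8-bit float (assumption)

def assign_dtypes(calib_ranges, fp8_max=FP8_MAX):
    """Map layer name -> 'tFP8' or 'tFP16' based on the calibrated max |activation|."""
    return {
        layer: ("tFP16" if amax > fp8_max else "tFP8")
        for layer, amax in calib_ranges.items()
    }

# Hypothetical calibration statistics (max absolute activation per layer)
ranges = {"attn.qkv": 35.0, "mlp.fc1": 1200.0, "mlp.fc2": 90.0}
print(assign_dtypes(ranges))
```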
|
|
|
|
|
## Quantization

The Tensordyne SDK offers multiple post-training quantization strategies to convert AI models for efficient inference on the Tensordyne Inference System, fully customizable for your optimization targets.

Here we showcase several preselected quantization variants that can be applied on the fly to quantize the model to Tensordyne data types. The calibration-based strategies are defined by quantization configurations provided as `.json` files.
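The idea behind calibration-based quantization can be sketched as a fake-quantize (quantize-dequantize) pass: a scale is derived from calibration data, values are snapped to a low-precision grid, and scaled back. The sketch below is an illustrative approximation only; it does not reproduce the tFP8/tFP16 formats or the SDK's actual algorithm, and the `qmax` value is an assumed E4M3-style bound.

```python
import numpy as np

def calibrate_scale(calib_batches, qmax=448.0):
    # Derive a per-tensor scale from the max |activation| seen during calibration.
    amax = max(float(np.max(np.abs(b))) for b in calib_batches)
    return amax / qmax

def fake_quantize(x, scale, qmax=448.0):
    # Quantize-dequantize: scale into the representable range, round to a coarse
    # grid standing in for a low-precision float format, then scale back.
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

rng = np.random.default_rng(0)
calib = [rng.normal(size=1024) for _ in range(8)]
scale = calibrate_scale(calib)

x = rng.normal(size=16)
x_q = fake_quantize(x, scale)
# In-range values incur a rounding error of at most ~scale/2.
print(float(np.max(np.abs(x - x_q))))
```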
|
|
|
|
|
The quantized models are evaluated on a subset of the [imagenet-1k](https://huggingface.co/datasets/ILSVRC/imagenet-1k) test set. Negative relative accuracy drops indicate that the quantized model performs better than the float base model.
|
|
|
|
|
| Model Configuration         | Top-1 Accuracy [%] | Relative Top-1 Accuracy Drop vs. IEEE FP32 | Details                                                     |
|-----------------------------|--------------------|--------------------------------------------|-------------------------------------------------------------|
| IEEE FP32                   | 71.36              | –                                          | The baseline model trained in IEEE FP32                     |
| calibration_based_tFP16     | 71.34              | 0.02 %                                     | calibration-based tFP16 quantization                        |
| layerwise_mixed_precision   | 71.20              | 0.22 %                                     | calibration-based mixed-precision: tFP8, outliers in tFP16  |
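The relative drop column is presumably computed from the absolute Top-1 accuracies; a minimal helper (not part of the SDK) makes the convention explicit:

```python
def relative_drop_percent(baseline_acc, quantized_acc):
    """Relative Top-1 accuracy drop vs. the baseline, in percent.

    Negative values mean the quantized model outperforms the baseline.
    """
    return (baseline_acc - quantized_acc) / baseline_acc * 100.0

# Mixed-precision variant from the table above: (71.36 - 71.20) / 71.36 * 100
print(round(relative_drop_percent(71.36, 71.20), 2))  # 0.22
```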
|
|
|
|
|
## Getting Started
|
|
Refer to the [Tensordyne Hugging Face Hub tutorial](https://resources.tensordyne.ai/sdk/v0.1.1/tutorials/tutorials/#tensordyne-hugging-face-hub-tutorials) for instructions on using the artifacts provided in this repository. |
|
|
Our [hosted documentation](https://resources.tensordyne.ai/sdk/v0.1.1/) provides more information on Tensordyne's quantization strategies and introduces you to our SDK. |
|
|
|