INT8-H16P / README.md
Omdano's picture
Upload README.md with huggingface_hub
8eada2c verified
# DinoV3 Vision Transformer Huge (INT8 Quantized)
INT8 quantized version of `facebook/dinov3-vith16plus-pretrain-lvd1689m` using BitsAndBytes.
## Model Details
- **Base Model**: DinoV3 Vision Transformer Huge (840M parameters)
- **Quantization**: INT8 weight-only quantization via BitsAndBytes
- **Size**: ~845MB (from ~1.7GB original)
- **Compression**: ~2x size reduction
- **Accuracy Loss**: <1% typical
## Usage
```python
from transformers import AutoModel, BitsAndBytesConfig
# Load the INT8 quantized model
model = AutoModel.from_pretrained(
"Omdano/INT8-H16P",
trust_remote_code=True,
quantization_config=BitsAndBytesConfig(load_in_8bit=True),
device_map="auto"
)
# Use for feature extraction or classification
outputs = model(pixel_values=inputs)
```
## Benefits
- **2x smaller** than full precision model
- **Faster inference** on GPU
- **Same API** as original DinoV3
- **Minimal accuracy loss** (<1%)
## Requirements
```bash
pip install transformers bitsandbytes torch
```
## Original Model
Based on [facebook/dinov3-vith16plus-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vith16plus-pretrain-lvd1689m)
## License
Apache 2.0 (same as original DinoV3)