---
license: apache-2.0
language:
- it
- en
base_model:
- Almawave/Velvet-2B
---
# Velvet-2B (4-bit Quantized)
This is a 4-bit quantized version of [Almawave/Velvet-2B](https://huggingface.co/Almawave/Velvet-2B), produced with bitsandbytes.
## Model Details
- **Base Model**: Almawave/Velvet-2B
- **Quantization**: 4-bit (nf4) with bitsandbytes
- **Compute Dtype**: float16
- **Double Quantization**: Enabled
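
To make the effect of double quantization concrete, the following is a rough sketch of the storage cost per weight. It assumes the block sizes described for NF4 in the QLoRA paper (64 weights per block, 256 scaling constants per second-level block); the function name and parameters are illustrative and not part of bitsandbytes.

```python
def bits_per_param(weight_bits=4, block_size=64, const_bits=32,
                   double_quant=False, dq_const_bits=8, dq_block_size=256):
    """Approximate storage cost per quantized weight, in bits.

    Assumed defaults (not read from this repository): 64 weights per
    quantization block, 32-bit scaling constants, and, with double
    quantization, 8-bit constants grouped in blocks of 256.
    """
    if not double_quant:
        # One full-precision scaling constant is stored per block of weights.
        overhead = const_bits / block_size
    else:
        # The scaling constants are themselves quantized, and one
        # full-precision constant is kept per group of quantized constants.
        overhead = (dq_const_bits / block_size
                    + const_bits / (block_size * dq_block_size))
    return weight_bits + overhead


print(bits_per_param())                   # without double quantization
print(bits_per_param(double_quant=True))  # with double quantization
```

Under these assumptions, double quantization lowers the overhead of the scaling constants from 0.5 bits to roughly 0.127 bits per weight.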
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "Obiactum/Velvet-2B-4bit"

# 4-bit NF4 settings matching the values listed under Model Details
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# device_map="auto" places the weights on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```
```
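
Once the model and tokenizer are loaded, text generation could look like the sketch below. The helper name `generate_text`, the default of 64 new tokens, and the sample prompt are illustrative assumptions rather than part of this repository; the helper uses the standard `generate` and `decode` methods from transformers.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=64):
    """Return the prompt plus a generated continuation as a string.

    Hypothetical helper: expects an already-loaded causal LM and its tokenizer.
    """
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # For a causal LM, generate() returns the prompt tokens followed by
    # up to max_new_tokens newly generated tokens.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence in the batch.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call, assuming `model` and `tokenizer` come from a loading snippet
# like the one in this section:
# print(generate_text(model, tokenizer, "Qual è la capitale d'Italia?"))
```

The helper applies no chat template; if the base model expects one, format the prompt accordingly before calling it.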
## Benefits
- Reduced memory usage (approximately 75% less weight memory than with 16-bit weights)
- Faster inference on compatible hardware
- Maintains reasonable model performance
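
The memory figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes a nominal 2 billion parameters (taken from the model name, not a measured count) and that every weight is quantized; in practice some layers usually stay in 16-bit, so the real saving is somewhat smaller. The function names are illustrative.

```python
def weight_memory_gib(num_params, bits_per_weight):
    """Memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


def reduction(baseline_bits, quantized_bits):
    """Fraction of weight memory saved relative to the baseline precision."""
    return 1 - quantized_bits / baseline_bits


NUM_PARAMS = 2e9  # assumption: nominal size implied by the "2B" in the name

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
int4_gib = weight_memory_gib(NUM_PARAMS, 4)
print(f"16-bit weights: {fp16_gib:.2f} GiB")
print(f"4-bit weights:  {int4_gib:.2f} GiB")
print(f"reduction:      {reduction(16, 4):.0%}")
```

This counts weights only and ignores activations, the KV cache, and quantization constants. With a model loaded, `model.get_memory_footprint()` from transformers reports the actual size in bytes.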
```yaml
license: apache-2.0
base_model: Almawave/Velvet-2B
tags:
- 4bit
- bitsandbytes
- quantized
- Velvet-2B
```