# Velvet-2B (4-bit Quantized)
This is a 4-bit quantized version of Almawave/Velvet-2B using bitsandbytes.
## Model Details
- Base Model: Almawave/Velvet-2B
- Quantization: 4-bit (nf4) with bitsandbytes
- Compute Dtype: float16
- Double Quantization: Enabled
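For reference, a checkpoint with these settings can be produced by loading the base model under the same configuration and saving the result. This is only a sketch, assuming a recent transformers/bitsandbytes combination that supports saving 4-bit models; it is not necessarily the exact procedure used to build this repository:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Same settings as listed above: 4-bit NF4, float16 compute, double quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

base = AutoModelForCausalLM.from_pretrained(
    "Almawave/Velvet-2B",
    quantization_config=bnb_config,
    device_map="auto",
)

# Serialize the quantized weights (assumes 4-bit serialization support)
base.save_pretrained("Velvet-2B-4bit")
```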
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "Obiactum/Velvet-2B-4bit"

# 4-bit NF4 quantization with float16 compute and nested (double) quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load the quantized model, dispatching layers across available devices
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```
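A minimal generation example follows; the prompt and generation settings are illustrative and not part of the original card:

```python
prompt = "What is the capital of Italy?"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```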
## Benefits
- Reduced memory usage (approximately 75% smaller than an fp16 checkpoint; see the footprint check after this list)
- Faster inference on compatible hardware
- Maintains reasonable model performance
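To sanity-check the memory claim, the loaded model's footprint can be inspected with the standard `transformers` helper `get_memory_footprint()`; the fp16 baseline figure in the comment is a rough estimate, not a measurement from this card:

```python
# Approximate memory used by the quantized weights, in GiB
footprint_gib = model.get_memory_footprint() / (1024 ** 3)
print(f"Quantized model footprint: {footprint_gib:.2f} GiB")

# For comparison, an fp16 copy of a ~2B-parameter model needs roughly
# 2e9 params * 2 bytes ≈ 4 GB of weights (rough estimate).
```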
license: Apache 2.0
base_model: Almawave/Velvet-2B
tags:
- 4bit
- bitsandbytes
- quantized
- Velvet-2B