# Mistral-7B BitNet LoRA — 4-Bit Merged
This repository contains a 4-bit quantized, LoRA-merged version of mistralai/Mistral-7B-Instruct-v0.2, fine-tuned on a small Q&A dataset about 1-bit LLMs and BitNet b1.58.
The LoRA adapter was merged into the base model to simplify deployment on constrained hardware or Hugging Face Spaces.
## Model Details
- Base model: mistralai/Mistral-7B-Instruct-v0.2
- Quantization: 4-bit (NF4 via bitsandbytes)
- Fine-tuning: LoRA (merged into the base model)
- Adapter repo: ogflash/mistral-lora-qa-1bit
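A merge like this is typically produced with the `peft` library's `merge_and_unload`. The sketch below is an assumed workflow for reference, not the exact script used to build this checkpoint:

```python
# Hypothetical merge script (assumed workflow, not the exact one used here).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "ogflash/mistral-lora-qa-1bit"

# Load the full-precision base model; merging folds LoRA deltas into
# the original weights, so they must be loaded unquantized.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

# Attach the LoRA adapter, then merge its weights into the base layers.
merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()

# Save the standalone merged checkpoint alongside the tokenizer.
merged.save_pretrained("mistral-merged-1bit-4bit")
AutoTokenizer.from_pretrained(base_id).save_pretrained("mistral-merged-1bit-4bit")
```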
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

model_id = "ogflash/mistral-merged-1bit-4bit"

# 4-bit NF4 quantization; dequantized matmuls run in float16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=bnb_config,
)

# Instruction-style prompt, matching the ### Instruction / ### Response
# format apparently used during fine-tuning.
prompt = "### Instruction:\nWhat is BitNet b1.58?\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```