Mistral-7B BitNet LoRA — 4-Bit Merged

This repository contains a 4-bit quantized, LoRA-merged version of mistralai/Mistral-7B-Instruct-v0.2, fine-tuned on a small Q&A dataset about 1-bit LLMs and BitNet b1.58.

The LoRA adapter was merged with the base model for easier deployment on constrained hardware or Hugging Face Spaces.


Model Details

  • Base model: mistralai/Mistral-7B-Instruct-v0.2
  • Quantization: 4-bit NF4 via bitsandbytes
  • Fine-tuning: LoRA, merged into the base weights
  • Adapter repo: ogflash/mistral-lora-qa-1bit
  • Format: Safetensors, ~7B parameters (F32 / F16 / U8 tensors)
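
As a point of reference, a merge like this is usually produced with peft. The sketch below is an assumed workflow, not the exact script used for this repository, and the output path is illustrative: it loads the base model in fp16, attaches the adapter from ogflash/mistral-lora-qa-1bit, folds the LoRA weights into the base, and saves the result.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "ogflash/mistral-lora-qa-1bit"

# Load the base model in fp16; merging needs unquantized weights.
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter, then fold its weights into the base model.
model = PeftModel.from_pretrained(base, adapter_id)
model = model.merge_and_unload()

# Save the merged checkpoint (path is illustrative).
model.save_pretrained("mistral-merged-1bit-4bit")
AutoTokenizer.from_pretrained(base_id).save_pretrained("mistral-merged-1bit-4bit")

The uploaded checkpoint additionally stores 4-bit weights; re-loading the merged model with a BitsAndBytesConfig and calling save_pretrained again is one way to produce that, though the exact steps used here are not documented.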

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

model_id = "ogflash/mistral-merged-1bit-4bit"

# 4-bit NF4 quantization with fp16 compute, matching how the model was merged.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=bnb_config,
)

# Alpaca-style Instruction/Response prompt, matching the fine-tuning format.
prompt = "### Instruction:\nWhat is BitNet B1.58?\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
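
By default, generate uses greedy decoding. For more varied answers you can enable sampling; the values below are illustrative and have not been tuned for this model:

outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,   # sample instead of greedy decoding
    temperature=0.7,  # illustrative value, not tuned for this model
    top_p=0.9,        # illustrative value, not tuned for this model
)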