Biomni-R0-32B-INT4-to-BF16 (Bridge Model)

This is a dequantized BF16 version of the AWQ INT4 Biomni model: the INT4 weights have been unpacked and converted back to BFloat16.

Purpose

This "Bridge Model" serves several purposes:

  1. Fine-tuning base: Use as a starting point for LoRA or full fine-tuning
  2. Research: Study quantization/dequantization quality recovery
  3. Compatibility: Run on hardware without INT4/FP8 support

Dequantization Details

| Parameter    | Value                              |
|--------------|------------------------------------|
| Source       | Biomni-R0-32B-AWQ-INT4-CustomCalib |
| Target Dtype | BFloat16                           |
| Method       | Standard AWQ unpacking (W4A16)     |
| Group Size   | 128                                |

Important Notes

⚠️ This model is NOT identical to the original BF16 model.

The dequantization process recovers an approximation:

  • W_bf16_recovered = W_int4 × Scale
  • Some precision loss is expected from the quantization → dequantization roundtrip
  • Best suited for fine-tuning, where training can recover the quantization error
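The roundtrip above can be illustrated with toy numbers (a plain-Python sketch; the weight and scale values are made up, not taken from the model):

```python
# Symmetric INT4 quantization maps a float weight into [-8, 7] via a
# per-group scale; dequantization multiplies back, leaving rounding error.
w = 0.1234                               # original full-precision weight
scale = 0.05                             # per-group scale (one per 128 weights)
q = max(-8, min(7, round(w / scale)))    # quantize -> 2
w_recovered = q * scale                  # dequantize -> ~0.1, not 0.1234
error = abs(w - w_recovered)             # irrecoverable rounding error
```

The error is bounded by half the scale per weight, which is why the recovered BF16 weights approximate, but never exactly match, the original checkpoint.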

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "hassanshka/Biomni-R0-32B-INT4-to-BF16",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("hassanshka/Biomni-R0-32B-INT4-to-BF16")

# Use for inference or fine-tuning
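Note that loading in BF16 needs roughly 4x the weight memory of the INT4 checkpoint. A rough back-of-envelope estimate (assuming ~32.8B parameters; activations and KV cache add more on top):

```python
# Weight memory only, toy arithmetic: 2 bytes per BF16 weight vs 4 bits per
# INT4 weight. The 32.8e9 parameter count is an approximation.
params = 32.8e9
bf16_gb = params * 2 / 1e9      # ~65.6 GB of BF16 weights
int4_gb = params * 0.5 / 1e9    # ~16.4 GB of INT4 weights
```

This is the trade-off of the bridge model: broader hardware compatibility and trainability in exchange for memory.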

Dequantization Process

The model was dequantized using the following algorithm:

import torch

def unpack_awq_standard(packed_weight, scales):
    # packed_weight: int32 tensor of shape (rows, packed_cols), eight 4-bit
    # weights per 32-bit word; scales: one scale per group of 128 columns.
    group_size = 128
    rows, packed_cols = packed_weight.shape
    scales_expanded = scales.repeat_interleave(group_size, dim=1)

    packed_weight = packed_weight.to(torch.int32)
    unpacked_cols = []
    mask = 0xF

    for i in range(8):
        # Extract the i-th nibble and sign-extend it to a signed INT4 value.
        weight_chunk = (packed_weight >> (i * 4)) & mask
        weight_chunk = torch.where(weight_chunk >= 8, weight_chunk - 16, weight_chunk)
        unpacked_cols.append(weight_chunk)

    # (rows, packed_cols, 8) -> (rows, packed_cols * 8)
    weights = torch.stack(unpacked_cols, dim=-1)
    weights = weights.view(rows, packed_cols * 8)

    dequantized = weights.to(torch.bfloat16) * scales_expanded.to(torch.bfloat16)
    return dequantized
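The nibble logic above can be checked in isolation: pack eight signed 4-bit values into one 32-bit word, then recover them with the same shift, mask, and sign-extension steps (a plain-Python sketch, no torch required):

```python
def pack_int4(values):
    # Pack eight signed 4-bit integers (range [-8, 7]) into one 32-bit word,
    # lowest nibble first, matching the shift order used when unpacking.
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xF) << (i * 4)
    return word

def unpack_int4(word):
    out = []
    for i in range(8):
        nibble = (word >> (i * 4)) & 0xF
        # Sign-extend: nibbles >= 8 encode negative values.
        out.append(nibble - 16 if nibble >= 8 else nibble)
    return out

vals = [-8, -3, -1, 0, 1, 2, 5, 7]
assert unpack_int4(pack_int4(vals)) == vals
```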

License

Apache 2.0 (same as base model)

Citation

If you use this model, please cite the original Biomni model.


Model tree

Base model: Qwen/Qwen3-32B