Biomni-R0-32B-INT4-to-BF16 (Bridge Model)

This is a BF16 model obtained by dequantizing the AWQ INT4 Biomni model (INT4 → BF16). The weights are stored in BFloat16, but their values are the ones recovered from the 4-bit checkpoint.

Purpose

This "Bridge Model" serves several purposes:

  1. Fine-tuning base: Use as a starting point for LoRA or full fine-tuning
  2. Research: Study quantization/dequantization quality recovery
  3. Compatibility: Run on hardware without INT4/FP8 support
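As a rough guide to the hardware trade-off behind the compatibility point, the sketch below estimates weight storage only. It assumes a nominal 32e9 parameters (taken from "32B" in the model name, not an exact count) and one 16-bit scale per group of 128 weights for the INT4 case. The helper name `weight_footprint_gib` is illustrative. Activations and KV cache are ignored.

```python
def weight_footprint_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB (weights only, no activations or KV cache)."""
    return num_params * bits_per_weight / 8 / 2**30


# Assumption: nominal parameter count read off the "32B" in the model name.
NOMINAL_PARAMS = 32e9

# BF16 stores 16 bits per weight.
bf16_gib = weight_footprint_gib(NOMINAL_PARAMS, 16)

# INT4 stores 4 bits per weight plus, by assumption, one 16-bit scale
# shared by each group of 128 weights.
int4_gib = weight_footprint_gib(NOMINAL_PARAMS, 4 + 16 / 128)

print(f"BF16 weights: ~{bf16_gib:.1f} GiB")
print(f"INT4 weights: ~{int4_gib:.1f} GiB")
```

Under these assumptions the BF16 bridge model needs roughly four times the weight memory of the INT4 checkpoint, which is the cost of running without INT4 kernels.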

Dequantization Details

Parameter       Value
Source          Biomni-R0-32B-AWQ-INT4-CustomCalib
Target Dtype    BFloat16
Method          Standard AWQ unpacking (W4A16)
Group Size      128

Important Notes

⚠️ This model is NOT identical to the original BF16 model.

The dequantization process recovers an approximation:

  • W_bf16_recovered = W_int4 × Scale (one scale per group of 128 weights)
  • Some precision loss is expected from the quantization → dequantization roundtrip
  • Best used for fine-tuning where the loss can be recovered
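The roundtrip loss can be illustrated on a single small group. This is a minimal pure-Python sketch that assumes symmetric 4-bit quantization (integers in -8..7, one scale per group), matching the formula above. It is not the actual AWQ calibration, which chooses scales from activation statistics. The helper names and the sample values are made up for illustration.

```python
def quantize_group(values, qmin=-8, qmax=7):
    """Symmetric 4-bit quantization of one group; returns (ints, scale)."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude onto qmax; fall back to 1.0 for an all-zero group.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    ints = [min(qmax, max(qmin, round(v / scale))) for v in values]
    return ints, scale


def dequantize_group(ints, scale):
    """Recover approximate weights: W_recovered = W_int4 * scale."""
    return [q * scale for q in ints]


# Illustrative "original" weights for one tiny group (real groups hold 128 values).
original = [0.31, -0.12, 0.05, -0.44, 0.27, 0.0, -0.09, 0.18]

ints, scale = quantize_group(original)
recovered = dequantize_group(ints, scale)

# Per-weight error is at most half a quantization step when nothing is clipped.
max_err = max(abs(a - b) for a, b in zip(original, recovered))
print("int4 values:", ints)
print("max abs error:", max_err, "half step:", scale / 2)
```

The recovered values land on a grid of 16 levels per group, so converting to BF16 afterwards does not restore the information lost at quantization time.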

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "hassanshka/Biomni-R0-32B-INT4-to-BF16",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("hassanshka/Biomni-R0-32B-INT4-to-BF16")

# Use for inference or fine-tuning

Dequantization Process

The model was dequantized using the following algorithm:

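import torch

def unpack_awq_standard(packed_weight, scales):
    # packed_weight: (rows, packed_cols) integer tensor, eight INT4 values per 32-bit word
    # scales: (rows, packed_cols * 8 // group_size), one scale per group of weights
    group_size = 128
    rows, packed_cols = packed_weight.shape
    scales_expanded = scales.repeat_interleave(group_size, dim=1)

    packed_weight = packed_weight.to(torch.int32)
    unpacked_cols = []
    mask = 0xF

    for i in range(8):
        # Extract the i-th 4-bit nibble from every packed word
        weight_chunk = (packed_weight >> (i * 4)) & mask
        # Reinterpret the unsigned nibble (0..15) as a signed INT4 value (-8..7)
        weight_chunk = torch.where(weight_chunk >= 8, weight_chunk - 16, weight_chunk)
        unpacked_cols.append(weight_chunk)

    # (rows, packed_cols, 8) -> (rows, packed_cols * 8)
    weights = torch.stack(unpacked_cols, dim=-1)
    weights = weights.view(rows, packed_cols * 8)

    dequantized = weights.to(torch.bfloat16) * scales_expanded.to(torch.bfloat16)
    return dequantized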
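
To make the bit layout concrete, here is a pure-Python sketch of the same nibble arithmetic on a single 32-bit word. It assumes the layout implied by the loop above: eight signed 4-bit values per word, lowest nibble first. The helpers `pack_int4` and `unpack_int4` are hypothetical names for illustration and are not part of any library.

```python
def pack_int4(values):
    """Pack eight signed 4-bit ints (-8..7) into one 32-bit word, lowest nibble first."""
    if len(values) != 8:
        raise ValueError("expected exactly 8 values")
    word = 0
    for i, v in enumerate(values):
        if not -8 <= v <= 7:
            raise ValueError(f"value {v} does not fit in signed 4 bits")
        # Two's-complement nibble: negative values map to 8..15.
        word |= (v & 0xF) << (i * 4)
    return word


def unpack_int4(word):
    """Inverse of pack_int4: shift, mask, then sign-extend each nibble."""
    out = []
    for i in range(8):
        nibble = (word >> (i * 4)) & 0xF
        out.append(nibble - 16 if nibble >= 8 else nibble)
    return out


values = [3, -1, 7, -8, 0, 5, -4, 2]
word = pack_int4(values)
roundtrip = unpack_int4(word)
print(hex(word), roundtrip)
```

Unpacking is lossless with respect to the stored integers. The precision loss discussed earlier comes from the original quantization step, not from this bit manipulation.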

License

Apache 2.0 (same as base model)

Citation

If you use this model, please cite the original Biomni model.

Model tree for hassanshka/Biomni-R0-32B-INT4-to-BF16

Base model: Qwen/Qwen3-32B