This is a 4-bit quantized version of Phi-3 4k Instruct. Quantization was done with:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit quantization with nested (double) quantization;
# compute is carried out in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# foundation_model_name is the base Phi-3 4k Instruct checkpoint
model = AutoModelForCausalLM.from_pretrained(
    foundation_model_name,
    device_map='auto',
    quantization_config=bnb_config,
    trust_remote_code=True,
)
```
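
A minimal usage sketch follows, assuming the quantized weights are published under a repository id (shown here as a hypothetical placeholder) and loaded via the standard `transformers` chat-template API; since the quantization config is saved with the model, no extra `BitsAndBytesConfig` is needed at load time:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; replace with this model's actual Hub id
repo_id = 'your-username/phi-3-4k-instruct-bnb-4bit'

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    device_map='auto',
    trust_remote_code=True,
)

# Phi-3 Instruct models expect chat-formatted prompts
messages = [
    {'role': 'user', 'content': 'Explain 4-bit NF4 quantization in one sentence.'}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors='pt'
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```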