SentenceTransformer
#2
by sajozsattila - opened
I'm trying to use this model with SentenceTransformer, like this:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("mlx-community/Qwen3-Embedding-4B-4bit-DWQ")
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
```
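For context on what the last call computes: `model.similarity` defaults to pairwise cosine similarity between the embeddings. A minimal NumPy sketch of the same computation, with toy vectors standing in for the real model output:

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    # Normalize each row to unit length, then the dot products are cosines.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normed @ normed.T

# Hypothetical 2-D embeddings for three sentences (real ones are much wider).
emb = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
sims = cosine_similarity_matrix(emb)
print(sims.shape)  # (3, 3)
```

The diagonal is 1.0 (each vector compared with itself), and the matrix is symmetric.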
However, I got this error:
```
File ~/miniconda3/envs/ds/lib/python3.11/site-packages/transformers/quantizers/auto.py:244, in AutoHfQuantizer.supports_quant_method(quantization_config_dict)
    242     quant_method = QuantizationMethod.BITS_AND_BYTES + suffix
    243 elif quant_method is None:
--> 244     raise ValueError(
    245         "The model's quantization config from the arguments has no `quant_method` attribute. Make sure that the model has been correctly quantized"
    246     )
    248 if quant_method not in AUTO_QUANTIZATION_CONFIG_MAPPING:
    249     logger.warning(
    250         f"Unknown quantization type, got {quant_method} - supported types are:"
    251         f" {list(AUTO_QUANTIZER_MAPPING.keys())}. Hence, we will skip the quantization. "
    252         "To remove the warning, you can delete the quantization_config attribute in config.json"
    253     )

ValueError: The model's quantization config from the arguments has no `quant_method` attribute. Make sure that the model has been correctly quantized
```
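The check that raises here is simple: transformers looks for a `quant_method` key in the model's `quantization_config` and fails when it is missing. A rough, simplified sketch of that logic (the abridged `SUPPORTED` list and function body below are my paraphrase, not the actual transformers source):

```python
SUPPORTED = ["awq", "bitsandbytes_4bit", "gptq", "hqq"]  # abridged for illustration

def supports_quant_method(quantization_config_dict: dict) -> bool:
    quant_method = quantization_config_dict.get("quant_method")
    if quant_method is None:
        # This is the branch the traceback above lands in.
        raise ValueError(
            "The model's quantization config from the arguments has no "
            "`quant_method` attribute. Make sure that the model has been "
            "correctly quantized"
        )
    return quant_method in SUPPORTED

# The MLX config has group_size/bits but no quant_method, hence the error:
mlx_config = {"group_size": 64, "bits": 4}
try:
    supports_quant_method(mlx_config)
except ValueError as e:
    print("raised:", e)
```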
OK, so it says I need to set `quant_method`. I can try this:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "mlx-community/Qwen3-Embedding-4B-4bit-DWQ",
    config_kwargs={
        "quantization_config": {
            "group_size": 64,
            "bits": 4,
            "quant_method": "what_should_be???",
        }
    },
)
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
```
However, I don't see which value I should use for DWQ here. It lists the following options:
```
Unknown quantization type, got what_should_be??? - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'quark', 'fp_quant', 'eetq', 'higgs', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet', 'vptq', 'spqr', 'fp8', 'auto-round', 'mxfp4'].
```
Any idea how I should set this up for SentenceTransformer on an Apple M4 Pro?
This model is quantized for the MLX framework (that's what the DWQ in the name refers to), so it won't work with SentenceTransformer, which expects transformers-compatible weights.
mzbac changed discussion status to closed