SentenceTransformer

#2
by sajozsattila - opened

I tried to use this model with SentenceTransformer like this:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("mlx-community/Qwen3-Embedding-4B-4bit-DWQ")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)

However, I got this error:

File ~/miniconda3/envs/ds/lib/python3.11/site-packages/transformers/quantizers/auto.py:244, in AutoHfQuantizer.supports_quant_method(quantization_config_dict)
    242     quant_method = QuantizationMethod.BITS_AND_BYTES + suffix
    243 elif quant_method is None:
--> 244     raise ValueError(
    245         "The model's quantization config from the arguments has no `quant_method` attribute. Make sure that the model has been correctly quantized"
    246     )
    248 if quant_method not in AUTO_QUANTIZATION_CONFIG_MAPPING:
    249     logger.warning(
    250         f"Unknown quantization type, got {quant_method} - supported types are:"
    251         f" {list(AUTO_QUANTIZER_MAPPING.keys())}. Hence, we will skip the quantization. "
    252         "To remove the warning, you can delete the quantization_config attribute in config.json"
    253     )

ValueError: The model's quantization config from the arguments has no `quant_method` attribute. Make sure that the model has been correctly quantized
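For context, the quantization_config in this repo's config.json carries only group_size and bits (the MLX convention), so the quant_method lookup that transformers performs fails before the weights are even loaded. A minimal sketch of that check, simplified from the traceback above (not the actual transformers source):

```python
# Simplified sketch of the check that raises the error above;
# the real logic lives in transformers' AutoHfQuantizer.supports_quant_method.
def supports_quant_method(quantization_config_dict):
    quant_method = quantization_config_dict.get("quant_method")
    if quant_method is None:
        raise ValueError(
            "The model's quantization config from the arguments has no "
            "`quant_method` attribute. Make sure that the model has been "
            "correctly quantized"
        )
    return True

# MLX-style config, as found in this repo's config.json: no quant_method key.
mlx_config = {"group_size": 64, "bits": 4}
try:
    supports_quant_method(mlx_config)
except ValueError as e:
    print("raises:", e)
```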

OK, so it says I need to set quant_method. I can try something like this:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "mlx-community/Qwen3-Embedding-4B-4bit-DWQ",
    config_kwargs={
        "quantization_config": {
            "group_size": 64,
            "bits": 4,
            "quant_method": "what_should_be???"
        }
    }
)

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)

However, I do not see what I should use for DWQ in this case. It lists the following options:

Unknown quantization type, got what_should_be??? - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'quark', 'fp_quant', 'eetq', 'higgs', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet', 'vptq', 'spqr', 'fp8', 'auto-round', 'mxfp4'].

Any idea how I should set this up with SentenceTransformer on an Apple M4 Pro?

MLX Community org

This is a model for the MLX framework, so it won't work with SentenceTransformer.

mzbac changed discussion status to closed
