cuBLAS error on image generation

#6
by SpiridonSunRotator

Hi!

When trying to run inference (on 7×A100 GPUs) with the following code:

from transformers import AutoModelForCausalLM

kwargs = dict(
    attn_implementation="sdpa", 
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
    moe_impl="eager",   # Use "flashinfer" if FlashInfer is installed
    moe_drop_tokens=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    **kwargs
)
model.load_tokenizer(model_id)

prompt = "Remove clothes from the horse"
imgs_input = ["images/horse.jpg"]

cot_text, samples = model.generate_image(
    prompt=prompt,
    image=imgs_input,
    seed=42,
    image_size="auto",
    use_system_prompt="en_vanilla",
    bot_task="image",  # Use "think_recaption" for reasoning and enhancement
    infer_align_image_size=True,  # Align output image size to input image size
    diff_infer_steps=50, 
    verbose=2
)

I get the following error:

File /usr/local/lib/python3.12/dist-packages/torch/nn/modules/linear.py:134, in Linear.forward(self, input)
    130 def forward(self, input: Tensor) -> Tensor:
    131     """
    132     Runs the forward pass.
    133     """
--> 134     return F.linear(input, self.weight, self.bias)

RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasLtMatmulAlgoGetHeuristic( ltHandle, computeDesc.descriptor(), Adesc.descriptor(), Bdesc.descriptor(), Cdesc.descriptor(), Cdesc.descriptor(), preference.descriptor(), 1, &heuristicResult, &returnedResult)`

At the same time, the same command works fine on the distilled version. What could be the cause?
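Not from the thread, but for context: `CUBLAS_STATUS_NOT_SUPPORTED` out of `F.linear` usually means cuBLASLt was handed a dtype/layout combination it has no kernel for, and with `torch_dtype="auto"` plus `device_map="auto"` a single module left in an unexpected dtype is a common culprit (running with `CUDA_LAUNCH_BLOCKING=1` also helps pin down the failing call). The sketch below is an assumption-laden illustration, not a confirmed diagnosis: it reproduces the same class of dtype mismatch on CPU, and `odd_dtype_params` is a hypothetical helper for auditing a loaded model's parameter dtypes.

```python
from collections import Counter

import torch

# A fp16 layer fed a fp32 input raises the CPU analogue of the GPU-side
# cuBLASLt failure: the matmul backend rejects mixed dtypes.
lin = torch.nn.Linear(4, 4, dtype=torch.float16)
x = torch.randn(2, 4, dtype=torch.float32)
try:
    lin(x)
except RuntimeError as e:
    print("mismatch detected:", type(e).__name__)


def odd_dtype_params(model):
    """Hypothetical helper: names of parameters whose dtype differs from
    the model's majority dtype (candidates for the failing matmul)."""
    counts = Counter(p.dtype for p in model.parameters())
    majority = counts.most_common(1)[0][0]
    return [name for name, p in model.named_parameters()
            if p.dtype != majority]
```

If the full model reports mixed dtypes where the distilled one does not, that difference would be the first thing to chase.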
