cuBLAS error on image generation
#6 opened by SpiridonSunRotator
Hi!
When trying to run inference (on 7× A100 GPUs) with the following code:
```python
from transformers import AutoModelForCausalLM

kwargs = dict(
    attn_implementation="sdpa",
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
    moe_impl="eager",  # Use "flashinfer" if FlashInfer is installed
    moe_drop_tokens=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    **kwargs,
)
model.load_tokenizer(model_id)

prompt = "Remove clothes from the horse"
imgs_input = ["images/horse.jpg"]

cot_text, samples = model.generate_image(
    prompt=prompt,
    image=imgs_input,
    seed=42,
    image_size="auto",
    use_system_prompt="en_vanilla",
    bot_task="image",  # Use "think_recaption" for reasoning and enhancement
    infer_align_image_size=True,  # Align output image size to input image size
    diff_infer_steps=50,
    verbose=2,
)
```
I get the following error:

```
File /usr/local/lib/python3.12/dist-packages/torch/nn/modules/linear.py:134, in Linear.forward(self, input)
    130 def forward(self, input: Tensor) -> Tensor:
    131     """
    132     Runs the forward pass.
    133     """
--> 134     return F.linear(input, self.weight, self.bias)

RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasLtMatmulAlgoGetHeuristic( ltHandle, computeDesc.descriptor(), Adesc.descriptor(), Bdesc.descriptor(), Cdesc.descriptor(), Cdesc.descriptor(), preference.descriptor(), 1, &heuristicResult, &returnedResult)`
```
At the same time, the same call works fine with the distilled version of the model. What could be the cause?
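In case it helps to narrow this down: `CUBLAS_STATUS_NOT_SUPPORTED` often points at a dtype/driver/architecture mismatch, so here is a generic environment check I can run (not specific to this model; the version printout is just standard `torch` introspection):

```python
import torch

# Print the pieces that usually matter when cuBLAS rejects a matmul:
# the torch build, the CUDA toolkit it was compiled against, and the
# GPU's compute capability (e.g. A100 is sm_80).
print("torch:", torch.__version__)
print("built for CUDA:", torch.version.cuda)
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    print("compute capability:", torch.cuda.get_device_capability(0))
else:
    print("CUDA not available in this process")
```

Happy to post the output of this if that is useful for debugging.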