
SFTTrainer Error

#12
by badhon1512 - opened

Hi,

While fine-tuning SmolVLM with SFTTrainer using the standard data format, I’m getting this error:

raise ValueError(
ValueError: Mismatch in image token count between text and input_ids. Got ids=[883, 897, 899, 938] and text=[1377, 1377, 1377, 1377]. Likely due to truncation='max_length'. Please disable truncation or increase max_length.

I already tried increasing max_length, but the error persists. Any assistance would be greatly appreciated.
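
To illustrate what the message seems to describe, here is a minimal sketch in plain Python. It does not call the real processor or SFTTrainer; `IMAGE_TOKEN_ID`, `count_image_tokens`, and `truncate` are hypothetical names made up for this example, and the sequence layout is a toy assumption. It only shows how cutting a sequence to a fixed length can leave fewer image placeholders in the ids than the text expects.

```python
# Hypothetical stand-in for the image placeholder token id; the real id
# would come from the processor's tokenizer.
IMAGE_TOKEN_ID = -1


def count_image_tokens(ids):
    """Count occurrences of the image placeholder in a token id sequence."""
    return sum(1 for t in ids if t == IMAGE_TOKEN_ID)


def truncate(ids, max_length):
    """Keep only the first max_length ids, as right-side truncation would."""
    return ids[:max_length]


# Toy sequence (assumed layout): 50 text tokens, 1377 image placeholders,
# then 50 more text tokens.
full_ids = [1] * 50 + [IMAGE_TOKEN_ID] * 1377 + [1] * 50

expected = count_image_tokens(full_ids)                    # placeholders the text expects
after_cut = count_image_tokens(truncate(full_ids, 1024))   # placeholders left after the cut

if after_cut != expected:
    print(f"mismatch: ids={after_cut} text={expected}")
```

In this toy case the counts only agree once the length limit is at least as long as the whole sequence, so a limit that is raised but still shorter than the expanded image tokens, or a second limit applied somewhere else, would give the same kind of mismatch.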
