Update preprocessor_config.json

#2
by GoGiants1 - opened

Summary

This PR fixes a configuration mismatch between config.json and preprocessor_config.json that causes model loading to fail in vLLM.

Problem

The model’s vision_config (in config.json) expects image inputs of size 384×384, while the image preprocessor is configured with a different value (364) in preprocessor_config.json.
As a result, vLLM raises a shape mismatch error during model loading:

ValueError: pixel_values dim[2] expected 384, got 364. Expected shape: ('bnp', 3, 384, 384), but got torch.Size([17, 3, 364, 364])

Root Cause

  • config.json: vision_config expects image_size = 384
  • preprocessor_config.json: max_image_size.longest_edge = 364

Because of this inconsistency, the preprocessor emits 364×364 patches, which then fail the 384×384 shape check shown in the error above.
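
The mismatch can be checked locally before loading the model. The following is a minimal sketch, assuming both files sit in one local model directory and use the key paths named above; the function names read_image_sizes and configs_agree are hypothetical helpers, not part of vLLM or transformers.

```python
import json
from pathlib import Path


def read_image_sizes(model_dir):
    """Return (expected, actual) image sizes from the two config files.

    expected: config.json -> vision_config.image_size
    actual:   preprocessor_config.json -> max_image_size.longest_edge
    """
    model_dir = Path(model_dir)
    config = json.loads((model_dir / "config.json").read_text())
    preproc = json.loads((model_dir / "preprocessor_config.json").read_text())
    expected = config["vision_config"]["image_size"]
    actual = preproc["max_image_size"]["longest_edge"]
    return expected, actual


def configs_agree(model_dir):
    """True when the preprocessor's patch size matches the vision config."""
    expected, actual = read_image_sizes(model_dir)
    return expected == actual
```

With the values from this PR, read_image_sizes would return (384, 364) before the fix, so configs_agree reports False.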

Fix

This PR updates two values in preprocessor_config.json to align it with the model’s vision configuration: max_image_size.longest_edge (364 to 384) and size.longest_edge (1456 to 1920).

"max_image_size": {
-  "longest_edge": 364
+  "longest_edge": 384
}
"size": {
-  "longest_edge": 1456
+  "longest_edge": 1920
}
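
Until the change is merged, the same two values can be patched in a local copy of the file. This is a rough sketch, assuming a downloaded preprocessor_config.json on disk; apply_fix and its parameter names are hypothetical and simply mirror the diff above.

```python
import json
from pathlib import Path


def apply_fix(path, max_edge=384, size_edge=1920):
    """Rewrite the two longest_edge values changed in this PR.

    max_edge  -> max_image_size.longest_edge
    size_edge -> size.longest_edge
    All other keys in the file are left untouched.
    """
    path = Path(path)
    cfg = json.loads(path.read_text())
    cfg.setdefault("max_image_size", {})["longest_edge"] = max_edge
    cfg.setdefault("size", {})["longest_edge"] = size_edge
    path.write_text(json.dumps(cfg, indent=2) + "\n")
    return cfg
```

Calling apply_fix("preprocessor_config.json") in the model directory would write the updated values back to the same file.
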
