Image Dimension Test Results

#27
by Koitenshin - opened

Google says you can use any image resolution. I decided to test their claims.

I made 3 instances and tested a few images at the following resolutions, using the exact same settings, including the seed number:
1080 x 1920 (100%)
810 x 1440 (75%)
540 x 960 (50%)

The 100% & 75% images gave incomplete descriptions, but the images resized to 50% gave a complete description.

Was the image encoder only trained with a maximum resolution of 1024^2?
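For anyone who wants to check the arithmetic behind that question, here is a minimal sketch that compares the pixel count of each tested resolution against 1024^2. The limit is only the hypothesis from this post, not a documented figure, and the names `ASSUMED_MAX_PIXELS`, `scaled_size` and `pixel_report` are made up for illustration.

```python
# Scale factors tested in the post, applied to the 1080 x 1920 original.
BASE_W, BASE_H = 1080, 1920
SCALES = (1.00, 0.75, 0.50)

# ASSUMPTION: hypothetical encoder limit taken from the question above.
# It is not a confirmed property of the model.
ASSUMED_MAX_PIXELS = 1024 ** 2


def scaled_size(width, height, scale):
    """Return (width, height) after uniform scaling, rounded to whole pixels."""
    return round(width * scale), round(height * scale)


def pixel_report(width=BASE_W, height=BASE_H, scales=SCALES,
                 limit=ASSUMED_MAX_PIXELS):
    """Return one (w, h, pixels, within_limit) tuple per scale factor."""
    rows = []
    for scale in scales:
        w, h = scaled_size(width, height, scale)
        pixels = w * h
        rows.append((w, h, pixels, pixels <= limit))
    return rows


if __name__ == "__main__":
    for w, h, pixels, ok in pixel_report():
        print(f"{w} x {h}: {pixels:,} px, within 1024^2: {ok}")
```

Only the 50% size (518,400 px) falls under 1024^2 = 1,048,576 px; the 75% size (1,166,400 px) and the 100% size (2,073,600 px) both exceed it. That lines up with the results above, but it does not prove the encoder has such a limit.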

If anyone else tests Gemma4 this way, please include your results as well.

Hi @Koitenshin, great observation, and thanks for testing this. We have verified the responses at the three specified resolutions on the gemma-4-E4B-it model, and it provides complete descriptions for all three. Please refer to this sample gist. However, to investigate your findings more closely, could you share reproducible code along with the token budget limits and the prompt you used? Thank you.
