Image Dimension Test Results

#27

by Koitenshin - opened Apr 27

Apr 27

Google says you can use any image resolution. I decided to test their claims.

I ran 3 instances and tested a few images at the following resolutions, using exactly the same settings for each, including the seed number:
1080 x 1920 (100%)
810 x 1440 (75%)
540 x 960 (50%)

The 100% and 75% images gave incomplete descriptions, but the same images resized to 50% gave complete descriptions.

Was the image encoder only trained with a maximum resolution of 1024^2?
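
For anyone checking the arithmetic behind that question, here is a minimal sketch that compares the pixel count of each test resolution against 1024^2. The 1024 x 1024 threshold is only the hypothesis raised above, not a documented limit, and the function names are made up for illustration.

```python
# Compare the pixel count of each test resolution against 1024 x 1024.
# HYPOTHETICAL_MAX_PIXELS is the guess from this post, not a documented limit.
BASE_WIDTH, BASE_HEIGHT = 1080, 1920
HYPOTHETICAL_MAX_PIXELS = 1024 * 1024


def scaled_size(width, height, percent):
    """Return (width, height) scaled by an integer percentage."""
    return width * percent // 100, height * percent // 100


def compare_to_threshold(percents=(100, 75, 50)):
    """Return (percent, width, height, pixels, exceeds_threshold) per scale."""
    rows = []
    for percent in percents:
        w, h = scaled_size(BASE_WIDTH, BASE_HEIGHT, percent)
        pixels = w * h
        rows.append((percent, w, h, pixels, pixels > HYPOTHETICAL_MAX_PIXELS))
    return rows


for percent, w, h, pixels, over in compare_to_threshold():
    status = "above" if over else "below"
    print(f"{percent:>3}%: {w} x {h} = {pixels:,} px "
          f"({status} 1024^2 = {HYPOTHETICAL_MAX_PIXELS:,})")
```

The 100% and 75% sizes come out above 1024^2 pixels and the 50% size comes out below it. That lines up with which images gave incomplete descriptions, but it is only a pattern in the numbers and does not show how the encoder actually handles large inputs.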

If anyone else tests Gemma4 this way, please include your results as well.

lakshmikala

May 4

Hi @Koitenshin, great observation, and thanks for testing this. We verified the responses at the three specified resolutions on the gemma-4-E4B-it model, and the model provides complete descriptions for all three. Please refer to this sample gist. However, to investigate your findings more closely, could you share reproducible code along with the token budget limits and the prompt you used? Thank you.
