pixel_values input doesn't have the right shape to work in your different examples

#1
by pd-intrasec - opened

The pixel_values shape is declared as float32[batch_size,num_channels,height,width,512], which is weird.
If I change the image shape to [1, 1, 3, 512, 512], I get this error:

Input shapes:
pixel_values: (1, 1, 3, 512, 512)
input_ids: (1, 7)
attention_mask: (1, 7)
❌ Example failed: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Split node. Name:'/model/vision_model/embeddings/Split' Status Message: Cannot split using values in 'split' attribute. Axis=0 Input shape={1,32,32} NumOutputs=2 Num entries in 'split' (must equal number of outputs) was 2 Sum of sizes in 'split' (must equal size of selected axis) was 2
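
For what it's worth, a quick check suggests the (1, 1, 3, 512, 512) input does satisfy the declared signature, so the failure seems to come from inside the graph (the Split node) rather than from the input check. Below is a minimal sketch in plain Python, assuming the declared dims quoted above are accurate; `shape_mismatches` and `DECLARED_PIXEL_VALUES` are hypothetical names used only for illustration, not part of the model or of ONNX Runtime.

```python
from typing import List, Sequence, Union

Dim = Union[int, str]

# Declared signature as quoted in this post (an assumption, not read from the model file).
# String entries are symbolic dims that accept any size; ints are fixed sizes.
DECLARED_PIXEL_VALUES: Sequence[Dim] = ("batch_size", "num_channels", "height", "width", 512)


def shape_mismatches(declared: Sequence[Dim], actual: Sequence[int]) -> List[str]:
    """Return human-readable problems; an empty list means the shape fits the signature."""
    if len(declared) != len(actual):
        return [f"rank mismatch: expected {len(declared)} dims, got {len(actual)}"]
    problems = []
    for axis, (want, got) in enumerate(zip(declared, actual)):
        # Only fixed (integer) dims can be violated; symbolic dims match anything.
        if isinstance(want, int) and want != got:
            problems.append(f"axis {axis}: expected {want}, got {got}")
    return problems


# The shape I tried passes the declared signature...
print(shape_mismatches(DECLARED_PIXEL_VALUES, (1, 1, 3, 512, 512)))
# ...while the usual 4-D image layout is rejected on rank alone.
print(shape_mismatches(DECLARED_PIXEL_VALUES, (1, 3, 512, 512)))
```

If that reading is right, the dimension names in the signature are probably mislabeled by the export, and feeding a shape that merely matches the signature is not enough to make the vision embeddings run.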

Lamco Dvelopment org

Hello. Yes, I have discovered two things: (1) my testing wasn't nearly as thorough as I believed; and (2) even when configured correctly (and I'd now contend my version isn't), the solution I've offered is practically unworkable because of its speed, especially in CPU-only usage. I'm working on a revised solution and should provide it this afternoon, my time, after an appointment around 1.

Stay tuned, and thanks for your patience.

Hi there 👋 Nice to see you gave it a go 😀 This export is indeed quite tricky.

I finally got some time to work on this, so I spent the past couple of hours making a conversion: https://huggingface.co/onnx-community/granite-docling-258M-ONNX. I've added sample code to the README. Let me know if it works well for you!
