Different Behavior between this hf model and the local model

by QHQK - opened Oct 20, 2024

Oct 20, 2024

Hi, thanks for your excellent work.

When testing Grounding DINO with the provided "how to use" code, I changed the input text from "a cat. a remote control." to "a cat. a remote control. the right cat". I find that grounding-dino (hf) tends to match all detected cat boxes to the phrase "a cat", while the locally installed code with "groundingdino_swinb_cogcoor.pth" always matches the box of the right cat to the phrase "the right cat" and ignores "a cat".
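For reference, the modified test on the HF side can be sketched roughly as below. This assumes the `transformers` zero-shot object detection classes; `detect` is a hypothetical wrapper, and `model_id` and `image` are left as parameters because the thread does not name them. The box threshold is left at the library default.

```python
# Prompts quoted in this thread.
PROMPT_ORIGINAL = "a cat. a remote control."
PROMPT_MODIFIED = "a cat. a remote control. the right cat"


def detect(model_id, image, text, text_threshold=0.3):
    """Run one zero-shot detection pass and return the post-processed result.

    `model_id` is a Hub checkpoint name and `image` a PIL image; both are
    supplied by the caller (assumptions, not taken from the thread).
    """
    # Imported here so the prompt constants can be used without the heavy
    # dependencies being installed.
    import torch
    from transformers import AutoModelForZeroShotObjectDetection, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForZeroShotObjectDetection.from_pretrained(model_id)

    inputs = processor(images=image, text=text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Maps boxes back to phrases from the prompt; PIL sizes are (width, height),
    # so they are reversed to (height, width) for target_sizes.
    results = processor.post_process_grounded_object_detection(
        outputs,
        inputs.input_ids,
        text_threshold=text_threshold,
        target_sizes=[image.size[::-1]],
    )
    return results[0]
```

Calling `detect` once with `PROMPT_ORIGINAL` and once with `PROMPT_MODIFIED` on the same image gives the two outputs being compared here.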

I'm curious whether these two base models are the same. If they are different, which one will perform better?

EduardoPacheco

Oct 22, 2024

Hey @QHQK! Are you using `.post_process_grounded_object_detection`?

tlpss

Sep 23, 2025

edited Sep 23, 2025

~~I'm having similar issues! I basically copied the script from the model card, so it does include the postprocessing.~~

Ok no sorry, my bad. The difference was that I had only a single object category for the original checkpoint and two different categories for the HF-hub model, and apparently that changes the output (the confidence scores differ, while the box proposals seem to be the same).
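A mismatch like this is easy to rule out by building the text prompt from one shared category list and passing the same string to both pipelines. A minimal sketch, assuming the lowercase, period-separated prompt style of the "how to use" example; `build_prompt`, `split_prompt` and `same_categories` are hypothetical helper names, not part of either codebase:

```python
def build_prompt(categories):
    """Join category phrases into one lowercase, period-separated prompt."""
    cleaned = [c.strip().lower().rstrip(".").strip() for c in categories if c.strip()]
    return " ".join(f"{c}." for c in cleaned)


def split_prompt(prompt):
    """Recover the category phrases from a period-separated prompt."""
    return [p.strip().lower() for p in prompt.split(".") if p.strip()]


def same_categories(prompt_a, prompt_b):
    """True when two prompts contain the same phrases in the same order."""
    return split_prompt(prompt_a) == split_prompt(prompt_b)


single = build_prompt(["a cat"])
double = build_prompt(["a cat", "a remote control"])
print(single)                           # a cat.
print(double)                           # a cat. a remote control.
print(same_categories(single, double))  # False
```

Checking `same_categories` before comparing scores confirms that both runs were given the same phrases, so any remaining difference comes from the models rather than the prompts.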
