How to use OpenGVLab/InternVL-14B-224px with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-feature-extraction", model="OpenGVLab/InternVL-14B-224px", trust_remote_code=True)

# Load model directly
from transformers import AutoProcessor, AutoModel

processor = AutoProcessor.from_pretrained("OpenGVLab/InternVL-14B-224px", trust_remote_code=True)
model = AutoModel.from_pretrained("OpenGVLab/InternVL-14B-224px", trust_remote_code=True)
Why do my tests show that this model cannot measure the similarity between text and images well?
Note that the 'summarize:' prefix and tokenizer.pad_token_id = 0 are both required.
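For reference, here is a minimal sketch of where those two settings go in an image-text scoring script. It is not a verified reproduction: the tokenizer arguments (`use_fast=False`, `add_eos_token=True`, `max_length=80`) and the `model(image=..., text=..., mode="InternVL-C")` call are assumptions based on the usage pattern in the model card, and the actual signature is defined by the repository's remote code. `build_texts` and `image_text_probs` are hypothetical helper names.

```python
MODEL_ID = "OpenGVLab/InternVL-14B-224px"
PREFIX = "summarize:"


def build_texts(captions, prefix=PREFIX):
    """Prepend the required prefix to each caption unless it is already there."""
    return [c if c.startswith(prefix) else prefix + c for c in captions]


def image_text_probs(image_paths, captions, model_id=MODEL_ID):
    """Return softmax-normalised image-to-text scores (rows: images, cols: captions)."""
    # Heavy imports are local so build_texts can be used without them installed.
    import torch
    from PIL import Image
    from transformers import AutoModel, AutoTokenizer, CLIPImageProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = AutoModel.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
        trust_remote_code=True,
    ).to(device).eval()

    image_processor = CLIPImageProcessor.from_pretrained(model_id)
    # Assumption: tokenizer options follow the model card example.
    tokenizer = AutoTokenizer.from_pretrained(
        model_id, use_fast=False, add_eos_token=True
    )
    tokenizer.pad_token_id = 0  # must be set explicitly, per the note above

    images = [Image.open(p).convert("RGB") for p in image_paths]
    pixel_values = image_processor(images=images, return_tensors="pt").pixel_values
    pixel_values = pixel_values.to(torch.bfloat16).to(device)

    input_ids = tokenizer(
        build_texts(captions),  # every caption carries the 'summarize:' prefix
        return_tensors="pt",
        max_length=80,
        truncation=True,
        padding="max_length",
    ).input_ids.to(device)

    with torch.no_grad():
        # Assumption: forward signature provided by the repo's remote code.
        logits_per_image, _ = model(
            image=pixel_values, text=input_ids, mode="InternVL-C"
        )
    return logits_per_image.softmax(dim=-1)
```

Calling `image_text_probs(["cat.jpg"], ["a photo of a cat", "a photo of a dog"])` with a local image (the file name is a placeholder) should give one row of scores across the two captions. If the scores still look wrong, comparing a run with and without the prefix and the pad token setting is a quick way to check whether either one was being dropped.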
Could you share your test code so we can see whether the problem can be reproduced?